AO-A047  505 


UNCLASSlFieO 


ILLINOIS  UNIV  AT  URBANA-CHAMPAI5N  COORDINATED  SCIENCE  LAB  F/B  9/2 
AN  INTERPRETER  AND  COMPILER  FOR  AUBMENTED  TRANSITION  NETWORKS. CU) 
JUL  77  T W FININ  N00014-7S-C-0612 

T-9B  ML 


COORDINATED  SCIENCE  LABORATORY 


AN  INTERPtETER  AND 
COMPILER  POR  AUGMENTED 
TRANSITION  NETWORKS 


r 


AN  INTERPRETER  AND  COMPILER  FOR  AUGMENTED  TRANSITION  NETWORKS 


BY 

TIMOTHY  WILKING  FININ 


S.B.  Massachusetts  Institute  of  Technology,  1971 


THESIS 


Submitted  in  partial  fulfillment  of  the  requirements 
for  the  degree  of  Master  of  silane*  rompntiar  Science 
tn  the  Oriduilf  College  of  the 
University  of  Illinios  at  Urbana-Champaign,  1977 


Thesis  Adviser:  Professor  David  L.  Waltz 


Urbana,  Illinois 


ACKNOWLEDGEMENT 


The  work  described  in  this  Thesis  represents  a considerable  volume  of 
complex  programs  and  I have  received  some  assistance  In  constructing 
them.  Tomas  Lozano-Perez  originally  wrote  a simplified  recursive 
version  of  the  ATN  Interpreter  on  which  this  system  Is  based.  Jack 
Holloway  contributed  modified  versions  of  some  of  the  morphological 
analysis  routines  which  were,  In  turn,  based  on  those  described  by 
Woods.  These  have  been  further  modified  and  extended  for  our  own  use. 

Professor  D.L.  Waltz  has  provided  the  encouragement  and  support 
necessary  for  this  undertaking. 


} liViL'i.'i." 


aos-? 


IA£LL£ 


I  Introduction  

1.1  Introduction  

1.2  An  Overview  of  this  thesis 

1.3  Conventions  


II  Augmented  Transition  Networks 

11. 1 Introduction  . 

11. 2 The  ATN  Formalism  . 

11. 3 Design  considerations 

11. 4 Implementation  . 


Ill  The  ATN  Interpreter  

111.1  Invoking  the  ATN  Interpreter  

111. 2 Arc  Types  

111. 2.  a Arcs  which  test  the  current  word  . . . . 

111.2. b  Arcs  which  Modify  the  Flow  of  Control 

111.2. C  The  Virtual  arc  

111.2. d  Miscellaneous  Arcs  

111. 3 Defining  ATN  Networks  and  States  

111. 4 Functions  for  Manipulating  Registers  

111. 4.  a Setting  Registers  

111.4. b  Accessing  Registers  

111.4. C  Testing  the  contents  of  a register  . . 

111. 5 Flow  of  Control  

111. 5.  a Effecting  State  transitions  

111.5. b  Invoking  and  Returning  from  Sub-computations 

III. 5.0  Controlling  Backtracking  and  Failure  . . . 

111.5. d  Suspending  and  Resuming  Computations  . . . 

111.5. e  The  HOLD/VIRTUAL  facility  


V 


111. 6 Useful  Actions 39 

111. 6.  a Actions  which  test  the  current  word  ....  40 

111.6. b  Actions  which  Test  Arbitrary  words  ....  42 

111.6. C  Miscellaneous  Actions  43 

111. 7 Tracing  Facilities 45 

111. 8 Global  variables 47 

IV  The  ATN  Compiler 51 

IV.  1 Introduction 51 

IV. 1. a Compilation  of  a state  53 

IV.I.b  Compilation  of  an  individual  arc  55 

IV.1.C  Optimization  57 

IV.  2 Using  the  ATN  Compiler 6l 

IV. 2. a Compiling  networks  into  LISP  Code 63 

IV. 2. b Compilation  in  the  compiler  63 

IV.  2. c Switches  64 

V The  Lexicon 67 

V. 1  Introduction 67 

V.2  The  Dictionary  Format 68 

V.3  The  Dictionary  Manager 71 

V. 4  Auxiliary  Functions  73 

V. 4.a  Creating  Dictionary  Entries  73 

V.4.b  Dictionary  Files  76 

VI  Auxiliary  Functions  79 

VI.  1 Interfacing  with  the  LISP  Editor 79 

VI. 2 Displaying  ATN  networks  and  states 80 

APPENDICES 83 

A A Sample  ATN  Network 83 

B A Sample  Dictionary  86 

C Examples  of  Operation  87 


Vi 


D Examples  of  tracing  States  . 

E Examples  of  Tracing  Registers 

INDEX  

REFERENCES  


LIST  OF  FIGURES 

FIGURE 

1 . The  FLAMES  system  

2.  A Simple  Grammar  for  English  . 

3.  A Better  Grammar  for  English  . 

4.  Overview  of  the  ATM  Compiler  . 

5.  Simplification  Rules 

6.  An  Example  of  Simplification  . 


I 


Introduction 


1.1  Introduction 

This  thesis  discusses  one  part  of  a language  processing  system 
deve]  at  the  Coordinated  Science  Laboratory  of  the  University  of 
II  "his  system,  called  PLANES  [Waltz  75],  is  designed  to  be  a 

q.  jn  answering  system  which  accepts  and  answers  Questions 
concerning  a large  data  base  of  aircraft  maintenance  information. 
The  Intended  scope  of  the  PLANES  system  is  to  answer  such  questions 
as: 

How  many  Slcyhawks  required  engine  repairs  in  1973  • 

Did  any  of  these  log  more  than  200  hours  in  March. 

Show  me  the  tail  numbers  of  planes  which  were  NOR  during  May. 

What  kinds  of  aircraft  are  in  the  data  base? 

The  processing  of  a user's  request  is  divided  into  three  main  phases: 
parsing,  interpretation,  and  evaluation  (see  figure  1).  The  first 
phase,  the  parsing,  is  accomplished  by  a large  ATN  network  [Woods 
70].  This  network  defines  a "semantic  grammar"  [Brown  76]  which  maps 
the  users  request  Into  an  Internal  representation,  the  Paraphrase 
Language.  The  grammar  Is  "semantic"  in  that  It  incorporates  semantic 
and  pragmatic  knowledge  as  well  as  syntactic. 
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The  Paraphrase  Language  representation  of  the  user's  request  is 
translated  by  the  second  stage  into  a 'program'  to  generate  the  data 
to  answer  the  request.  This  stage  contains  detailed  knowledge  about 
the  data  bases.  It  must  know,  for  example,  which  sub-parts  of  the 
data  base  contain  certain  relations  and  the  particular  codes  used  to 
represent  certain  facts.  The  program  is  constructed  out  of  primitives 
provided  by  the  Query  Language. 

Finally,  the  Query  Language  evaluates  the  'program'  and  passes  the 
resulting  data  to  the  Response  Generator.  This  module  can  display  the 
answer  in  one  of  three  forms:  as  a simple  number  or  list,  as  a graph, 
or  as  a table.  The  choice  of  output  form  can  be  determined  by  the 
user  thru  a direct  request  (e.g.  "Draw  a graph  of  ...")  or  by  a set 
of  heuristics  which  attempt  to  find  the  most  "natural"  form.  A 
graph,  for  example,  will  be  generated  only  if  the  data*  consists  of  a 
set  of  tuples  which  can  be  interpreted  as  a function  of  two 
variables.  Furthermore,  the  number  of  tuples  must  lie  within  certain 
bounds. 

At  each  stage  of  the  process,  the  results  are  sent  to  the  History 
Keeper  which  manages  a set  of  stacks  of  relevant  information.  These 
stacks  contain  the  results  of  each  stage  (e.g.  users  request, 
paraphrase,  etc.},  syntactic  components  (e.g.  subject,  object,  etc.), 
and  semanticcontextual  information  (e.g.  time  specifications,  plane 
specifications,  etc.).  This  Information  is  made  available  for 


resolving  anaphoric  reference,  supplying  phrases  deleted  thru 
ellipsis,  and  generating  responses. 

The  entire  process  can  be  aborted  by  any  of  the  aajor  stages.  If 
this  is  done,  a suitable  error  message  is  generated  and  the  user  is 
Invited  to  reform  his  request. 

I .2  Aa  Overview  of  this  thesis 

\ 

■ This  thesis  is  Intended  to  both  document  the  Implementation  of  the 
ATN  „ interpreter  and  compiler  and  to  serve  as  a manual  for  anyone 
interested  in  using  it.  Chapter  II  gives  a brief  description  of 

ATN's  and  discusses  some  of  the  high  level  design  considerations. 
Chapter  III  describes  the  interpreter  and  the  auxiliary  functions 
available  to  the  user  in  some  detail.  Chapter  IV  presents  the 
compiler  which  can  translate  ATN  networks  into  LISP  code  or  machine 
language  instructions.  Chapter  V describes  the  dictionary  format 
expected  by  the  interpreter.  Also  discussed  are  the  various 

functions  provided  for  creating  and  maintaining  dictionaries. 
Chapter  VI  docuaents  several  packages  of  auxiliary  functions  provided 
for  interfacing  the  ATN  system  with  the  LISP  editor  and  Pretty- 
printer.  Appendices  include  a simple  parser  for  English,  a sample 
dictionary,  and  examples  of  their  operation.  An  index  to  function 
calls  and  global  variables  and  a bibliography  conclude  this  thesis. 
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1.3  Conventions 

This  thesis  documents  many  LISP  functions  and  "function- like" 
constructions.  As  part  of  the  description  for  a function,  a "syntax 
frame"  will  be  given  which  specifies  the  number  of  arguments  and 
whether  or  not  they  are  evaluated.  A "syntax  frame"  will  look  like  a 
typical  call  to  the  function,  i.e.  it  will  be  a list  whose  first 
element  is  the  name  of  the  function  and  whose  remaining  elements  are 
the  arguments.  Arguments  in  the  "syntax  fr2UDe"  will  typically  be 
enclosed  in  angle-brackets  (<,>).  For  example,  the  syntax  frame  for 
the  LISP  function  LESSP  might  be: 

(LESSP  <number>  <number>) 

which  states  that  LESSP  takes  two  arguments,  both  of  which  are 
evaluated. 

If  a quote-sign  (')  precedes  an  argument,  then  that  argument  is  not 
evaluated  by  the  function.  For  example,  the  following  describes  the 
function  DEFPROP: 

(DEFPROP  '<atom>  '<value>  '<property>) 

If  a function  takes  one  or  more  optional  arguments,  then  several 

"syntax  frames"  will  be  given.  For  example,  the  MacLISP  function 

CATCH  would  be  described  as: 

(CATCH  <expression>  '<tag>) 

(CATCH  <expresslon>) 

Two  conventions  will  be  used  to  exhibit  functions  which  take  an 
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indefinite  number  of  arguments.  In  one,  the  arguments  are  simply 
partially  enumerated  with  the  elliptical  as  in: 

(PLUS  <number1>  <numbe!-2>  ...  <numbern>) 

The  second  form  has  the  last  argument  to  the  function  preceded  by  the 
"dotted  pair"  dot,  as  in: 

(DEFUN  '<name>  '<argument  list>  . '<functlon  body>) 

The  last  argument  then  stands  for  an  indefinite  number  of  final 
arguments.  Example  calls  to  a function  will  usually  be  given  with 
the  "syntax  frame"  to  clarify  its  usage.  These  will  be  preceded  by 
an  "eg:". 
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The  ATN  formalism  has  been  well  received  by  the  computational 
linguistics  community  in  the  last  five  years.  A number  of  people 
were  working  with  similar  models  in  the  late  60 's,  but  the  idea  was 
crystallized  and  popularized  by  Woods  in  the  early  70 's  [Woods  71]. 
ATN  models  have  recently  been  applied  to  domains  other  that  the 
natural  language  parsing,  including  generation  of  natural  language 
[Kay  75],  and  computer  vision  research  [Lozano  77]. 

II. 2 lixfi  AlH  Formalism 

The  ATN  formalism  is  usually  described  in  a evolutionary  manner, 
showing  how  one  can  start  with  the  notion  of  a simple  finite  state 
machine  (FSM)  and  modify  and  embellish  it  to  arrive  at  an  Augmented 
Finite  State  Transition  Network  (ASFTN). 

The  FSM  ^ the  basis  for  ATN's 

We  start  with  a garden  variety  deterministic  FSM  consisting  of  a set 
of  labeled  states,  a set  of  ares  between  states,  an  initial  state. 
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and  a set  of  distinguished  final  or  accepting  states.  The  first  step 
might  be  to  extend  the  formalism  to  allow  for  non-determinism.  A 
simple  grammar  for  English  might  be  represented  by  the  graph  in 
figure  2. 

As  a model  for  natural  language  processing  this  is  Inadequate.  The 
most  serious  deficit  is  the  lack  of  any  mechanism  to  handle 
embedding  of  constituents  to  arbitrary  depth.  The  next  extension, 
then,  is  to  provide  a "push  down"  mechanism  whereby  one  can  suspend 
the  processing  of  a constituent  at  one  level  while  using  the  same 
network  to  process  an  embedded  one. 

Adding  recursion  extends  ihs.  cower  ^ sn  ISM 

We  can  represent  this  by  labeling  an  arc  from  one  state  to  another 
with  the  name  of  a third  state.  When  such  an  arc  is  encountered, 
processing  is  suspended  and  the  network  restarted  in  the  specified 
state.  If  an  accepting  state  is  eventually  reached,  the  suspended 
processing  is  resumed. 

Such  a capability  not  only  allows  one  to  achieve  the  power  to  parse  a 
context  ffee  language,  but  also  simplifies  the  representation  of  a 
grammar.  One  can  collect  common  network  fragments  and  specify  them 
in  one  place,  much  as  one  can  substitute  a subroutine  call  for  common 
program  sequences  in  a programming  language. 


Our  simplistic  English  grammar  might  now  be  represented  by  the  graph 


Figure  2 

A Simple  Grammar  for  English 
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in  figure  3-  Note  that  this  grammar  now  accepts  conjoined  sentences 
and  sentences  embedded  as  relative  clauses. 

Augaentatlon  realizes  power  ^ ATN’s 

Recursive  transition  networks  are  still  Inadequate  for  natural 
language  processing.  To  further  extend  the  power  of  our  formalism  we 
will  allow  arcs  to  be  augmented  with  conditions  and  actions.  Each 
arc  can  be  augmented  with: 

1 An  arbitrary  condition  which  must  be  met  before  the  arc  may  be 
taken. 


2 A set  of  arbitrary  actions  which  will  be  performed  if  the  arc 
is  taken.  These  can  have  side  effects  thru  the  setting  of 
registers  whose  contents  can  later  be  accessed  in  other  parts 
of  the  grammar. 

This  augmentation  gives  an  ATN  the  computational  power  of  a Turing 
machine.  The  final  result  is  a powerful  formalism  which  retains  much 
of  the  elegance  of  the  simple  FSM. 


II. 3 Design 


1 

I 


I 

* 

t 

\ 


Since  ATN's  are  based  on  non-detemlnlstlc  FSM's,  an  implementation 
must  be  able  to  simulate  all  possible  paths  thru  the  network.  We 
have  chosen  to  Implement  a depth- first  backtracking  version  for 
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several  reasons.  Firstly,  It  is  our  belief  that  natural  language 
processing  may  be  most  suited  to  this  approach.  At  least,  evidence 
suggests  that  our  only  example  of  competent  natural  language 
processors  (i.e.  humans)  do  not  use  highly  parallel  methods. 
Secondly,  we  believe  that  whatever  gains  might  be  made  thru  a 
parallel  approach  would  be  more  than  offset  by  the  additional  cost  of 
processing.  The  phenomenon  of  natural  language  is  such  that  it 
encompasses  an  extraordinarily  large  number  of  patterns  and 
constructions.  Most  of  these,  however,  are  relatively  rare,  the 
greatest  part  of  most  language  being  drawn  from  a few  common 
constructions.  A non-parallel  approach  can  easily  take  advantage  of 
this  fact  in  the  ordering  of  its  attempts  to  find  a path  thru  the  ATN 
network. 

To  simulate  the  non-determlnlstlc  nature  of  an  ATN  processor  (or  a 
FSM  for  that  matter)  one  needs  to  maintain  an  encoding  of  the  current 
configuration  (i.e.  state,  input  word,  register  environment,  etc.)  as 
well  as  a stack  of  past  configurations  which  represent  the  path  so 
far.  In  our  ATN  system,  we  have  chosen  a recursive  implementation  in 
which  this  information  is  stored  on  LISP's  Internal  stack.  The 
alternative  is  to  maintain  an  explicit  configuration  stack  built  from 
CONS  cells  from  the  general  pool.  The  advantages  of  a recursive 
implementation  include: 

1 Simplicity  of  design 

Backing  up  to  a previous  configuration  is  achieved  by 
"returning”  to  the  environment  of  that  configuration.  The 
undoing  of  side-effects  Is  automatic. 
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2 Efficiency 

Previous  configurations  are  stored  on  LISP's  internal  stack  and 
thus  do  not  consume  CONS  cells  from  the  general  pool.  This 
reduces  the  amount  of  LIST  space  required  and,  more  importantly, 
results  in  fewer  and  shorter  garbage  collections. 

The  chief  disadvantage  of  a recursive  implementation  is  that  we  give 
up  the  possibility  of  giving  the  user  complete  control  over 
processing.  Earlier  configurations  are  Inaccessible  to  the  user  as 
well  as  to  the  system  itself.  This  problem  is  alleviated  by  the 
inclusion  of  several  flow-of-control  modifiers  thru  which  one  can 
directly  fail  back  to  a specified  configuration. 

II. U Implementation 

The  ATN  interpreter.  Dictionary  Manager,  and  ATN  compiler  are  written 
in  the  LISP  dialect  MacLISP,  which  runs  on  a PDP>10  T0PS10 
timesharing  system.  The  following  gives  approximate  values  for 
memory  requirements: 

LIST  space  BPS  space 

Interpreter  5100  4000 

Dictionary  Manager  1200  1000 

Compiler  2500  1500 

In  this  table,  LIST  space  refers  to  the  number  of  memory  cells  used 

to  store  lists,  atoms,  etc.  and  BPS  space  refers  to  'Binary  Program 

Space* , the  number  of  memory  cells  used  to  store  compiled  code. 
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III 

Illfi  iUi  Interpreter 


III . 1 Invoking  ihfi  un  laterorttcr 

parse  And  PARSE1  invoke  lii£  interpreter 

PARSE  and  PARSE 1 are  the  main  functions  used  to  invoke  the  ATN 
Interpreter  on  an  input  string.  The  function  PARSE  is  Intended  to  be 
used  at  LISP's  toplevel  and  PARSE1  used  by  other  programs. 

(PARSE  '<3tring>  '<inltlal  state>  '<nuinber  of  parses>  ) 
eg:  (PARSE  (which  skyhawks  required  repairs)  s:start  3) 

The  function  PARSE  starts  the  ATN  interpreter  in  state  <initial 
state>  with  the  input  string  <string>.  It  takes  from  0 to  3 
arguments,  none  of  which  is  evaluated.  If  any  argument  is  omitted  or 
is  NIL,  it  is  defaulted  as  specified  below. 

The  first  argument  specifies  the  input  string  to  be  parsed.  If  it  is 
a list  of  atoms,  then  that  list  is  used  as  the  input  string.  If  it 
is  an  atom  which  has  a value,  then  the  value  of  the  atom  is  used  as 
the  input  string.  If  it  is  omitted  or  is  NIL,  then  the  function 
RBADTBXT  is  called  which  reads  the  input  string  from  the  TTY. 
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The  second  argument  to  PARSE  specifies  the  Initial  state  of  an  ATN 
network.  If  it  is  omitted  or  NIL,  it  defaults  to  the  value  of  the 
global  variable  9INITIALSTATE . 

The  third  argument  determines  how  many  parses  are  produced.  If  it  is 
emitted  or  NIL  it  defaults  to  one.  If  it  is  a positive  nunber,  then 
the  Interpreter  will  attempt  to  produce  that  many  parses.  If  it  is 
the  atom  ALL,  then  all  possible  parses  will  be  produced. 

The  function  PARSE  always  returns  the  value  so  it  is  run  for  its 

side-effects.  The  principle  side-effect  is  that  it  sets  the  the 
variable  PARSE  to: 

1 The  atom  PAIL  if  the  ATN  was  unable  to  reach  a final  state. 

2 The  value  POPped  by  the  ATN  if  <number  of  parses>  is  1 . 

3 A list  of  the  values  POPped  if  <number  of  parses>  is  greater 
than  one  or  the  atom  ALL. 

In  addition,  this  value  is  "pretty  printed"  on  the  TTY  and  the  atom 
AGAIN  is  set  to  the  input  string. 

(PARSE1  <strlng>  <inltlal  state>  <nufflber  of  parses>  } 
eg:  (PARSE1  '(show  me  the  list)  's:start  'all) 

The  function  PARSE1  requires  all  three  arguments,  each  of  which  is 
evaluated.  It  returns  the  value(s)  POPped  by  the  ATN  from  the  final 
state(s).  PARSB1  has  the  side-effect  of  binding  the  global  variable 

AGAIN  to  <input  strlng>.  I 

! 

I 
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(READTEXT) 


If  the  first  argument  to  PARSE  is  omitted  or  NIL,  the  function 
READTEXT  is  called  to  read  the  input  string  from  the  TTY.  This 
function  reads  a string  of  characters  from  the  TTY  until  a carriage- 
return  is  typed  and  returns  it  in  the  form  of  a list  of  atoms. 
Characters  which  have  a "special"  meaning  to  LISP  (e.g.  . , ; , etc.) 
are  automatically  "slashified"  (i.e.  taken  literally). 


The  function  incorporates  three  special  features: 

1 If  the  null  string  is  entered  by  typing  only  a carriage-return, 
then  READTEXT  returns  the  last  input  string  given  to  PARSE  (i.e. 
the  value  of  AGAIN). 

2 If  the  first  character  on  the  line  is  a "("  then  READTEXT  reads 
a LISP  s-expresslon,  evaluates  it,  and  prints  the  resulting 
value.  It  then  waits  for  a string  of  characters  to  be  typed  as 
the  input  string. 

3 Whenever  a line- feed  is  typed,  READTEXT  takes  the  next  character 
of  the  input  string  from  the  last  input  string  (AGAIN  again). 
Subsequent  line- feeds  produce  subsequent  characters  from  the  old 
input  string. 


III. 2 Arc  Types 


Our  ATN  interpreter  provides  a total  of  13  varieties  of  arcs.  These 
can  be  broken  down  into  the  following  categories: 

1 Flow  of  Control  (PUSH,  POP,  TO,  JUMP,  FAIL) 

These  are  arcs  which  effect  transitions  from  one  state  to 
another  or  Invoke  or  return  from  sub-computations. 

2 Testing  the  Current  Word  (CAT,  WRD,  ROOT,  PHRASE) 

These  arcs  are  taken  only  when  the  current  word  meets  a 
specified  condition. 


3 Virtual  Arc  (VIR) 

This  arc  provides  a natural  mechanism  for  handling  constituents 
which  are  parsed  "out  of  place"  in  the  input  string. 

4 Miscellaneous  Arcs  (TST,  DO,  AND) 

These  three  arcs  provide  a mechanism  for  miscellaneous 
computation. 


All  of  the  arcs  except  the  AND  arc  have  the  basic  form: 

(<type>  <head>  <test>  <action1>  ...  <actionn>  ) 
where  <type>  is  one  of  the  arc  types  (PUSH,  CAT,  TST,  etc),  <head>  is 
an  argument  for  that  arc  type,  <test>  is  an  arbitrary  condition  which 
must  be  met  before  the  arc  can  be  taken,  and  the  <actions>'s  are 
arbitrary  lisp  expressions  which  are  evaluated  if  the  arc  is  taken. 
Typically  the  last  action  will  specify  destination  - the  next  state 
to  go  to. 


III. 2. a Arcs  which  teat  the  current  word 


ins  MM  ggaaarea  Ihs.  current  aord 

(WRD  '<word(s)>  <test>  . . .<actlons>. . . <destination>) 
eg:  (VfRD  (the  a)  T (SETR  det  •)(T0  np:det)) 

(WRD  please  t (SETR  polite  t)(T0  s:start)) 


The  WRD  arc  compares  the  current  word  against  <word(s)>,  which  can  be 
an  atom  or  a list  of  atoms.  If  the  current  word  matches,  then  <test> 
is  evaluated  and,  if  true,  the  actions  are  evaluated  from  left  to 


right. 


£AI  &£:£  checks  ^ category 


(CAT  '<head>  <te3t>  . . . <actions> . . . <destination>) 
eg:  (CAT  adj  T (ADDR  adj  •)(T0  np:adj)) 

(CAT  (n  npr)  t (SETR  n •)(TO  np:n)) 

A CAT  arc  is  taken  if  the  current  word  has  a lexical  entry  under  the 
category  <head>  (if  it  is  an  atom)  or  any  of  the  categories  <head>  if 
a list.  Within  the  CAT  arc,  the  variable  * is  bound  to  the  root  form 
of  the  current  word.  For  example,  if  the  current  word  is  CRASHING, 
then  the  following  arc  would  set  the  register  VERB  to  CRASH: 

(CAT  V T (SETR  verb  »)(T0  s:verb)) 

Note  that  this  arc  may  generate  several  possibilities  if  the  current 

word  has  more  than  one  dictionary  entry  under  the  specified  category. 

For  example,  the  word  SAW  might  have  the  dictionary  entry: 

(SAW  V ((SEE  (tns  past)) 

(SAW  (tns  present) (untensed) )) 

In  this  case  the  CAT  arc  above  will  first  try  parsing  SAW  as  the  past 
tense  of  SEE  and,  if  that  fails,  will  try  parsing  SAW  as  the 
present/untensed  form  of  the  verb  SAW. 

The  ROOT  arc  checks  the  root  form 

(ROOT  '<root(s)>  <test>  . . .<action3>. . . <destination>) 
eg:  (ROOT  BE  T (SETR  v *)(to  S:BE)) 

(ROOT  (have  be)  (NULLR  aux)(SETR  aux  •)(T0  s:aux)) 

A ROOT  arc  compares  the  root  form  of  the  current  word  to  <root(s)>. 
Again,  <root(s)>  can  be  an  atom  or  a list  of  atoms.  The  arc  is  taken 
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if  the  current  word  matches  one  of  the  forms  in  <root(s)>.  As  with 
the  CAT  arc,  the  variable  • is  bound  to  the  root  form  of  the  current 
word  within  the  ROOT  arc. 


lilfi  PHRASE  gheckg  IHe  next  several  words 

(PHRASE  '<phrase>  <test>  . . .<actions> . . . <destination>) 
eg:  (PHRASE  (more  or  less)  T (SETR  fuzzy  T)(TO  quant:end)) 
(PHRASE  (please  (give  show  tell)  me)  T (TO  s : imperative) ) 
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The  PHRASE  arc  is  used  to  compare  the  next  several  words  in  the  input 
string  to  <phrase>.  This  argument  should  be  a list  of  atoms  or  lists 
of  atoms. 

Successive  elements  of  <phrase>  are  matched  against  successive  words 
from  the  input  string.  An  element  which  is  a list  enumerates  several 
possibilities  for  the  corresponding  word  in  the  input  string.  The 
arc  is  taken  if  and  only  if  each  element  of  <phrase>  is  successfully 
matched  against  the  input  string. 

III.2.b  Area  wHlab  MadiflC  IHa  Flow  ^ control 

These  arc  types  can  de  divided  into  those  which  involve  sub- 
computations  (PUSH,  POP,  POP!),  those  effecting  state  transitions 
(TO,  JUMP)  and  one  which  controls  backtracking  (FAIL). 

lilt  211SH.  ACa  loxalffiA  l aub-oomputatlon 

(PUSH  '<state><test>  ...  <actlons>...  <destinatlon>) 
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eg;  (PUSH  np  t (SENDR  goal  'subj)(SETR  subj  •)(TO  s:subj)) 

The  PUSH  arc  is  used  to  Invoke  a sub-computation.  If  the  expression 
<test>  is  true,  then  the  ATN  interpreter  is  recursively  called, 
starting  in  state  <state>.  All  registers  at  the  current  level  are 
saved  and  made  invisible  to  the  lower  level.  Before  actually 
invoking  the  lower  level , the  list  of  actions  is  scanned  for 
"preactions" , i.e.  those  which  should  be  evaluated  before  the  PUSH  is 
made.  These  "preactions"  are  SENDR 's  or  any  action  which  begins  with 
a "!"  (e.g.  (!  (OR  (EQ  $v  ’be) (SENDR  v))).  The  SENDR  actions,  and 
any  previously  done,  are  typically  used  to  set  register  values  in  the 
lower  computation. 

If  the  sub-computation  fails,  i.e.  does  not  reach  a state  which  can 
take  a POP  arc,  then  the  PUSH  falls.  If  a POP  from  the  lower  level 
is  taken,  then  control  is  returned  to  the  PUSH  arc.  The  variable  • 
la  then  set  to  the  value  POPped  by  the  lower  network,  the  values  of 
any  LIFTRed  registers  set,  and  the  "poatactiona"  evaluated.  These 
"post-actions"  are  Just  the  actions  on  the  arc  minus  the  "pre- 
actions". 

The  POP  are  returns  from  ^ sub-computation. 

(POP  <value>  <test>  . <actions>  ) 

The  POP  aro  returns  from  a sub-computation.  If  <test>  is  true,  the 
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<actions>  are  evaluated  from  left  to  right  and  then  <value>  is 
computed  and  returned  to  the  most  recent  PUSH  arc  or  PUSHATN  action. 
Both  <test>  and  <actions>  can  be  omitted.  If  the  <test>  argument  is 
omitted,  it  defaults  to  T. 

Two  conditions  affect  the  taking  of  a POP  arc.  If  any  items  have 
been  placed  on  the  HOLD  list  at  this  level,  the  POP  arc  is  blocked 
(see  section  II.5.e).  In  addition,  a POP  arc  will  not  be  taken  if  it 
would  be  returning  to  the  "top-level"  and  words  remain  to  be  parsed 
(i.e.  STRING  is  non-NIL).  This  last  condition  can  be  turned  off  if 
the  global  variable  USE-ALL-WORDS?  is  set  to  NIL. 

Some  examples  of  POP  arcs  are; 

(POP  (build-np)  T (LIFTR  head-noun  $noun)) 

(POP  (BUILDQ  (pp  ♦ +)  prep  np)) 

ins  £Q£1  ITS  really  returns 

(POP!  <value>  <test>  . <actions>  ) 

The  POP!  arc  is  almost  identical  to  the  POP  are  except  for  one 
important  difference:  one  can  not  back-up  into  the  sub-computation 
after  leaving  it  via  a POP!  are.  Under  our  implementation  of  the  ATN 
interpreter,  a POP!  arc  does  a RETURN  to  the  environment  of  the  most 
recent  PUSH  are  or  call  to  the  PUSHATN  action.  If  failure  later 
backs  up  to  this  point,  the  entire  sub-computation  will,  in  effect, 
be  backed  up  over  as  well.  The  grammar  writer  may  find  this  a useful 
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arc  In  that  it  gives  hioi  greater  control  over  the  automatic 
backtracking  done  by  the  ATN  interpreter. 

Infi  la  And  JUMP  arcs  effect  state  transitions 

(TO  '<state>  <test>  . <actlons>} 

(JUMP  '<state>  <test>  . <actions>) 

As  a convenience,  special  arcs  are  provided  which  duplicate  the  TO 

and  JUMP  actions.  Both  arc  types  cause  the  ATN  interpreter  to  advance 

to  the  state  specified  by  their  first  "argument".  The  TO  arc  causes 

the  current  word  to  be  advanced  and  the  JUMP  arc  does  not.  In  each 

case,  the  <test>  argument  and  the  <actions>  arguments  are  optional. 

If  the  <test>  is  omitted,  it  defaults  to  T.  Some  examples  are: 

(JUMP  s:end  (out-of-words?)) 

(JUMP  np:det) 

(TO  np:det  t (ADOR  ignored-words  *)) 

Illft  IAIL  ACfi  gantrala  backtracking 
(FAIL  '<where>  <test>  . <actlons>  ) 

The  FAIL  arc  can  be  used  to  gain  some  control  over  backtracking.  If 
the  <test>  is  true,  then  this  arc  will  evaluate  the  <actlons>  and 
cause  failure  to  propagate  backwards  to  a point  specified  by  <where>. 
Again,  the  <test>  and  <actions>  arguments  may  be  omitted. 
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The  <Mhere>  argument  may  specify  that  the  ATN  interpreter  fail  from 


the  current  arc,  the  current  state,  the  current  sub-coaputatlon,  an 

arbitrary  named  state,  or  to  the  Initial  top-level  call.  The  precise 

options  for  the  <where>  argument  are: 

STATE  : fall  from  the  current  state 

PUSH  : fall  from  the  current  subcomputation 

(i.e.  from  the  most  recent  PUSH) 

<state>  : fall  from  the  named  state 

TOP  : fail  altogether 

Note  that  the  setting  of  registers  in  the  <actlons>  will  have  no 

effect  since  their  side  effects  will  be  undone  by  the  failure.  Some 

example  of  FAIL  arcs  are: 

(PAIL  state) 

(FAIL  np:adj  (CAT  v)) 

(FAIL  push  (NOT  (CAT  prep))) 

III.2.C  The  Virtual  arc 

(VIR  '<cat>  <test>  ...<actions>.. . <destination>  ) 

(VIR  '(<cat>  <level>)  <test>  . . .<actions> . . . <destlnation>  ) 

The  VIR  or  virtual  arc  is  taken  if  a constituent  is  found  on  the  HOLD 
list  indexed  under  the  category  <cat>.  Again,  the  <test>  argument 
must  evaluate  to  non-NIL. 

Note  that  two  syntax  frames  are  given  for  this  are.  In  the  first, 
the  "head”  of  a VIR  arc  is  an  atomic  category  name.  In  the  second, 
it  is  a tuple  whoes  first  element  is  a category  name  and  whose  second 
is  a number  which  specifies  the  level  at  which  the  constituent  is  to 
be  found  on  the  HOLD  list.  See  section  Ill.S.e  for  details. 


If  the  VIR  arc  can  be  taken,  • is  bound  to  the  constituent  found  on 
the  HOLD  list  and  FEATURES  is  bound  to  the  (optional)  feature  list 
associated  with  that  constituent. 


iii.E.d  Miacelianeoua  Arcs 

Ibfi  ISI  applies  ^ arbitrary  .tfial 

(TST  '<label>  <test>  . . .<actlons> . . . <destlnation>  ) 

The  TST  arc  la  taken  If  the  arbitrary  LISP  expression  <teat> 
evaluates  to  non»NIL.  The  <label>  argument  Is  not  evaluated  and  Is 
not  used  in  any  way  by  the  ATN  Interpreter.  Typically  It  Is  used  for 
a mnenomlc  label  describing  the  function  of  the  arc.  Some  examples 

are: 

(TST  end-of-sent?  (AND  (NULL  •XNULL  string) )(JUMP  s:end)) 

(TST  t (MEMO  * cuss-words) (TO  stcomplaln)) 

me  J2a  ACC  liaa  CC  destination 

(DO  '<label>  <te8t>  . <actlons>) 

The  ATN  Interpreter  handles  the  DO  are  in  an  identical  manner  to  the 


TST  arc . 

If  the  <test> 

is 

true,  the 

<aetlons> 

are  evaluated. 

Its 

intended 

use,  however, 

is 

somewhat 

different. 

The  TST  arc 

is 

intended  to  have  a "destination  action”  as  its  final  action  (i.e.  a 
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call  to  TO  or  JUMP).  The  DO  arc  la  intended  to  be  uaed  without  a 

"destination  action".  Thus,  the  DO  arc  might  be  used  to  initialize  a 

set  of  registers.  Some  sample  DO  arcs  are  given  below: 

(DO  initialize  t (SETR  type  'dcl)(SETR  context  NIL)} 

(DO  warn  verbose?  (print  '{That  is  not  very  grammatical! i )) 

Illfi  MU  conloina  other  .accA 

(AND  <arc1>  <are2>  ...  <arcn>  ) 

The  AND  arc  is  the  only  one  which  does  not  fit  the  general  arc 
syntax.  The  AND  arc  takes  an  Indefinite  number  of  arguments,  each  of 
which  can  be  any  of  the  legal  arc  types  (including  another  AND  arc). 
The  effect  of  an  AND  arc  is  to  evaluate  each  of  the  "sub-arcs"  from 
left  to  right  until  one  of  them  fails  or  the  last  one  is  reached. 
Consequently,  none  of  the  "sub-arcs"  should  contain  a "destination 
action"  except  the  last  one,  which  should  have  a "destination"  as  its 
last  action. 

Note  that  after  evaluating  each  of  the  "sub-arcs",  the  current  word 
is  always  advanced.  This  arc  is  useful  in  eliminating  many  states 
which  contain  only  a single  arc.  This  can  greatly  increase  the 
readability  of  a network  by  keeping  together  arcs  which  form  a single 
path.  For  example,  to  handle  two-word  comparative  adjective,  one 
might  use  the  following  arc: 

(AND  (WRD  more  t) 

(CAT  adj  t (SETR  adj  (BUILDQ  (adj:  • comparative))} 

(TO  np:adj))) 
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III. 3 Defining  AUi  Metworka  ^ Statfia 
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Our  ATN  system  provides  functions  for  defining  ATN  networks,  states, 
Individual  arcs  and  special  word- triggered  Interrupt  functions. 

DEFATN  defines  ^ ABL  network 

(DEFATN  '<name>  ' <default-arcs>  '<default-reglsters> 

'(<state>  ...  <state>}) 

The  Ainctlon  DEFATN  takes  from  2 to  4 arguments  and  defines  an  ATN 
network.  The  first  argument  Is  the  name  of  the  ATN  network  and  the 
last  argument  Is  a list  of  the  states  In  the  network.  Intervening 
arguments  are  optional  and.  If  included,  can  specify  Initial  values 
for  registers  and  special  "default  arcs”  which  are  assumed  to  be  the 
Inltlad  arcs  of  each  state  In  the  network. 

If  a register  default  argument  Is  supplied.  Its  form  should  be: 

("default-register"  (<register  name>  <value>) 

(<reglster  name>  <value>) 

...  )) 

This  argument  causes  the  specified  registers  to  be  set  to  the 
specified  Initial  values  whenever  a state  In  the  network  Is  PUSHed 
to. 


If  a "default-arc"  argument  Is  given  to  DEFATN,  Its  form  should  be: 

("default-arc"  <arc1>  <arc2>  ...  <arcn>  ) 
where  an  arc  has  one  of  the  forms: 

<aro  name> 

(<aro  type>  <head>  ....  ) 
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If  the  arc  Is  an  atom,  it  Is  assumed  to  be  the  name  of  separately 
defined  arc  (see  DEFARC).  If  it  is  a list,  then  it  should  be  in  the 
form  of  a legally  defined  arc.  The  effect  of  this  argument  is  to 
cause  the  specified  arcs  to  be  added  at  the  beginning  of  each  state 
in  the  network. 

DEfSIATE  deflnea  iui  IndiYlduai  atate 
(DEFSTATE  •<name>  . '<arcs>  ) 

Individual  ATN  states  can  be  defined  with  the  function  DEFSTATE. 

This  function  takes  an  indefinite  number  of  arguments,  the  first  of 

which  is  the  name  of  the  state.  The  remaining  arguments  specify  the 

arcs  leading  from  the  state.  Again,  an  are  can  be  represented  either 

by  an  atomic  name  or  by  a complete  list.  An  example  is: 

(DEFSTATE  np: start 

(CAT  det  (SETR  det  •XTO  nprdet)) 

(CAT  pro  (SETR  noun  •)(T0  np:end)) 

(JUMP  np:det}) 

DEFARC  defines  an  individual  are 

(DEFARC  '<name>  '<are>) 

An  individual  arc  can  be  defined  with  the  function  DEFARC.  Ares 
defined  in  this  way  can  be  of  two  type?:  Regular  arcs  which  have  a 
destination  (i.e.  a call  to  TO  or  JUMP)  as  their  last  action  and 
degenerate  aros  which  do  not  have  a destination  as  their  last  action. 
Nhen  a degenerate  are  is  inserted  into  an  ATN  state,  a destination  is 
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supplied  which  causes  the  ATN  to  loop  back  to  that  state.  For 
example: 

(DEFARC  timephrase  (PUSH  tlmepp  T (SETR  time  •))) 

Is  an  arc  which  will  attempt  to  parse  a time  phrase  and  then  reenter 
the  current  state. 

JdQCl  Interrupts  .axis  created  lux  DEFINTERUPT 

Our  ATN  system  provides  a mechanism  whereby  certain  computations  can 
be  performed  whenever  a certain  word  Is  encountered  in  the  Input 
string.  The  function  DEFINTERUPT  associates  an  Interrupt  function 
with  a word. 

(DEFINTERUPT  •<word(s)>  •<functlon>) 
eg:  (DEFINTERUPT  please  cdr) 

(DEFINTERUPT  (in  on  before) 

(lambda  (x}( parse- tlme-pp  (car  x)(cdr  x)))) 

The  first  argument  to  DEFINTERUPT  should  be  a word  or  a list  of 
words.  The  second  should  be  a function  of  one  argument,  expressed  as 
either  the  atomic  name  of  the  function  or  as  a lambda  expression. 
Uienever  one  of  the  specified  words  Is  encountered  In  the  Input 
string,  the  associated  interrupt  function  is  called  with  the  Input 
string  as  Its  argument.  Its  return  value  Is  used  as  the  new  Input 
string.  Thus  the  first  example  would  cause  the  word  PLEASE  to  be 
Ignored  (since  it  returns  the  CDR  of  the  Input  string).  The  second 
example  would  call  the  function  PARSE-TIME-PP  whenever  the  words  IN, 
ON,  or  BEFORE  are  discovered  in  the  Input  string. 


rffmi 
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III. 4 Functlona  XS£  Manipulating  Registers 


A register  in  the  ATN  system  plays  the  role  of  a variable  in  a 
typical  high  level  programming  system.  The  Interpreter  provides 
functions  for  assigning  a register  a vidue  (SETR,  SENDR,  LIFTR), 
accessing  the  value  of  a register  (GETR,  $),  and  testing  the  contents 


of  a register  (NULLR). 

Registers  need  not 

be  declared 

and 

are 

"created"  when  they  are 

assigned  a value  for 

the  first 

time. 

The 

contents  of  a register 

which  has  never  been 

assigned  a 

value 

is 

defined  to  be  NIL. 

III. 4, a Setting  Registers 

Registers  can  be  set  at  the  current  level  (SETR)  in  the  next  lower 
level  (SENDR)  or  in  any  of  the  higher  levels  (LIFTR). 

SEXl  gfiig  A rflgiater  j&l  Iht  aurrent  level 

(SETR  '<register  name>  <value>  ) 
eg:  (SETR  verb  'be) 

The  function  SETR  assigns  a value  to  a register  at  the  current  level. 
SETR  assigns  the  register  <register  name>  the  value  <value>. 

SENDR  j register  at  g lower  level 
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(SENDR  '<raglster  naffle>  <value>) 


(SENDR  '<register  naine>) 


SENDR  is  used  to  set  the  contents  of  a register  at  a lower  level, 
typically  Just  before  a PUSH  action.  Whenever  a SENDR  is  evaluated, 
the  register  name  and  value  are  stored  on  the  SENDLIST.  When  a PUSH 
occurs,  the  SENDLIST  is  used  to  initialize  the  registers  to  the 

specified  values.  If  the  second  argument  is  omitted,  it  defaults  to 
the  current  contents  of  the  register  named  by  the  first  argument. 
Thus  (SENDR  noun)  is  the  equivalent  to  (SENDR  noun  (GETR  noun)). 

LIFTR  sets  ^ register  iit  A tiUheC  ISYSl 

The  function  LIFTR  is  used  to  set  the  contents  of  a register  at  a 
higher  level.  It  takes  from  one  to  three  arguments,  the  syntax 

being: 

(LIFTR  '<reglster  naffle>  <value>  <level>) 

(LIFTR  '<reglster  name>  <value>) 

(LIFTR  ’<register  name>) 

If  all  three  arguments  are  given,  LIFTR  assigns  <reglster  name>  the 
value  <value>  <level>  levels  above  the  current  one.  For  example, 

(LIFTR  stype  'q  1)  would  set  the  register  STYPE  to  Q in  the  next 

highest  level.  If  the  <level>  argument  is  the  atom  TOP,  then  it 
refers  to  the  highest  level.  If  the  <level>  argument  is  omitted,  it 
defaults  to  1 (l.e.  the  next  highest  level). 


The  second  argument,  <value>,  specifies  the  value  to  be  assigned  to 
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j’  the  register.  If  It  is  omitted,  the  current  contents  of  the  register 

are  used.  Thus  (LIFTR  noun)  is  equivalent  to  (LIFTR  noun  (GBTR 
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noun) ) . 
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When  a LIFTR  action  is  evaluated,  it  adds  the  complete  three  tuple 
(<reglster  name><value><level>)  to  the  special  register  LIFTLIST. 
When  a POP  action  is  taken,  the  LIFTLIST  is  processed  by  setting 
those  registers  associated  with  a <level>  of  1 and  "re-lifting"  the 
rest  of  the  elements  with  a decremented  <level>.  Thus,  the  effect  of 
a LIFTR  is  only  seen  once  we  have  POPped  to  the  appropriate  level. 

Ill.U.b  Accessing  aeglafcg.ia 

GETR  gets  the  contents  of  a register 

The  function  GETR  is  used  to  access  the  contents  of  a register  at  the 

current  level  or,  if  an  optional  argument  is  supplied,  at  any  higher 

level.  The  syntax  is: 

(GETR  '<reglster  naffle>) 

(GETR  '<reglster  naffle>  <level>) 

(GETR  '<register  naffle>  'nearest  <test>) 

If  only  one  argument  la  supplied,  GETR  retrieves  the  contents  of  that 

register  at  the  current  level.  An  optional  second  argument  instructs 

GETR  to  get  the  contents  of  the  register  at  a higher  level.  If 

<level>  evaluates  to  the  atom  NEAREST,  then  GETR  will  return  the 

contents  of  the  register  at  the  lowest  level  in  which  it  has  a value. 


I 


i 

I 


r 


32 


If  <level>  is  NEAREST,  an  optional  third  argument,  <test>,  can 
specify  an  arbitrary  teat  which  must  be  met  by  the  candidate  register 
value.  While  evaluating  <test>,  the  variable  * is  bound  to  the 
contents  of  the  candidate  register.  For  example,  the  expression: 

(GETR  verb  'nearest  '(TRANSITIVE  »)) 
would  search  upwards  thru  the  chain  of  levels  until  it  found  a VERB 
register  which  contained  a transitive  verb. 

Ill£  character  1 acta  ^ prefix  operator 

The  character  $ (i.e.  dollar  sign)  is  defined  to  be  a "read  macro 
character"  which  acts  as  a prefix  operator  identical  to  GETR.  Thus 
$noun  is  the  equivalent  to  the  function  call  (GETR  noun).  This 
feature  can  be  turned  on  or  off  with  the  function  DOLLARMACRO. 
Evaluating  (DOLLARMACRO  NIL)  will  turn  it  off  causing  the  dollar  sign 
to  lose  its  special  significance.  Similarly,  (DOLLARMACRO  T)  will 
turn  the  feature  back  on. 

111.4.0  Testing  contents  ^ ^ register 

MtlLLR  teats  the  value  of  j register 

Since  the  value  of  a register  which  has  never  been  set  is  defined  to 
be  NIL,  there  is  no  way  to  tell  whether  or  not  a register  has  been 
set  to  the  value  NIL  or  has  never  been  set.  To  provide  the  user  with 
this  capability,  the  function  NULLR  will  return  T if  and  only  if  a 
register  has  been  explicitly  set  to  NIL.  the  syntax  is  simply: 


(NULLR  '<regiater  natne>) 


III. 5 Flow  ill  Control 

Six  function  are  provided  which  modify  the  flow  of  control.  Five  of 
them  are  analogous  to  arcs  PUSH,  POP,  TO,  JUMP  and  FAIL.  The  sixth 
provides  a special  facility  by  which  one  can  save  the  state  of  a 
subcomputation  Just  before  returning  via  a POP  and  later  resume  it 
with  the  same  register  environment. 

III. 5. a Effecting  State  transitions 

TO  and  JUMP  advance  the  ATM  to  ^ new  state 

These  are  typically  found  as  the  last  action  at  the  end  of  an  arc,  as 
in: 

(CAT  V t (SETR  V •)(T0  srverb)) 

(TST  T (out-of-words) (JUMP  s:end)) 

They  can,  however,  be  called  at  any  point  by  the  user  and  embedded  in 

arbitrary  LISP  expressions. 

(TO  '<3tate  naffle>) 
eg:  (TO  srverb) 

The  TO  function  effects  a transition  to  the  named  state  and  causes 

the  current  word  to  be  advanced. 

(JUMP  '<state  naffle>) 
eg:  (JUMP  s:end) 
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The  JUMP  function  also  causes  the  ATN  Interpreter  to  enter  the 
specified  state  but  does  not  advance  the  Input  word.  The  next  state 
will  find  the  current  word  unchanged. 

I11.5.b  Invoking  aod  Returning  from  Sub-computations 

PUSHATN  aiaulatfia  a mu 

(PUSHATN  <atate>) 
eg:  (PUSHATM  np:start) 

The  effect  of  a call  to  the  function  PUSHATN  is  similar  to  that  of  a 
PUSH  arc.  It  recursively  applies  the  ATN  network  to  the  input  string 
starting  in  state  <atate>.  The  value  returned  by  this  function  is 
the  value  POPed  by  the  sub-computation  or  the  atom  FAIL  if  the  sub- 
computation  does  not  reach  a final  state.  Additional  side-effects 
are: 

1 The  variable  * is  bound  to  the  value  POPped. 

2 The  variable  LEX  is  bound  to  the  new  current  word. 

3 The  variable  STRING  is  bound  to  the  rest  of  the  input  string. 

PQPATM  aimulatea  a POP 

(POPATN  <value>) 
eg:  (POPATN  (buHdquestlon) ) 
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The  POPATN  function  is  similar  to  the  POP  arc.  It  causes  control  to 
return  to  the  most  recent  PUSH  arc  or  PUSHATN  function  call.  As  with 
the  POP  arc,  the  POP  will  only  succeed  if  nothing  has  been  placed  on 
the  HOLD  list  at  the  current  level. 

] 

III.5.C  Controlling  Backtracking  AOd  FaliUTt 
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EAIL  controls  backtracking 
(FAIL  <where>) 

The  FAIL  function  is  almost  identical  to  the  FAIL  arc.  It  causes  the 

ATN  interpreter  to  backtrack  to  a point  specified  by  the  argument 

<where>.  The  options  for  this  argument  are: 

ARC  : fail  from  the  current  arc 

* : fail  from  the  current  state 

PUSH  : fall  from  current  level  (i.e.  to  most  recent  PUSH) 

TOP  : fall  from  entire  network  (i.e.  give  up  completely) 

<state>:  fill  back  to  the  state  named  <state> 

iii.s.d  suapcndlng  Aod  Resuming  Conputatlfloa 

Suppose  that  we  wanted  to  parse  a sentence  like: 

How  many  planes  were  there  which  required  no  repairs  in  April? 

In  this  sentence,  the  relative  clause  "which  required  no  repairs  in 
April"  has  been  extraposed  to  the  right  (or  the  head  "how  many 


planes"  extraposed  to  the  left,  if  you  like).  We  would  like  to  be 
able  to  parse  these  two  sub-strings  as  if  they  occurred  together,  as 
is  usually  the  case: 

...planes  which  required  no  oaintenance  in  April... 

Using  RESUME  and  RESUMETAG,  one  can  "remember”  where  one  stopped 
parsing  a noun  phrase  (e.g.  in  "How  many  planes")  and,  at  »ome  later 
point  in  the  input  string,  resume  parsing  the  same  noun  phrase  in  the 
same  state  and  with  the  same  register  environment. 

RESUMETAG  fllYfia  A configuration 

(RESUMETAG  '<state  naffle>) 

(RESUMETAG) 

The  RESUMETAG  function  returns  a "tag"  which  encodes  a state  naune  and 
a register  environment.  With  this  "tag",  one  can  later  resume 
computation  in  this  state  with  the  same  register  environment.  If  the 
<state  naffle>  argument  is  not  given,  the  current  state  is  used. 

RESUME  rfiauofia  l suspended  computation 

(RESUME  <tag>  . '<reglster  naffles>) 
eg:  (RESUME  Inptag  verb) 

The  RESUME  function  resumes  the  subcomputation  which  generated  <tag> 
(i.e.  with  a call  to  RESUMETAG)  in  the  current  position  in  the  input 
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string.  Optional  arguments  to  RESUME  specify  registers  whose 
contents  should  be  "sent"  to  the  lower  computation.  The  value 
returned  by  RESUME  is  the  value  POPped  by  the  lower  level. 


III.5.e  Hit 


facility 


The  HOLD  action  together  with  the  VIR  arc  provide  a natural  mechanism 
for  handling  constituents  which  are  parsed  "out  of  place"  in  a 
sentence.  The  HOLD  action  allows  one  to  place  a parsed  constituent 
on  the  HOLD  list.  At  some  later  time,  the  VIR  arc  can  be  used  to 
retrieve  the  held  object  and  process  it  as  if  it  were  found  at  that 
point  in  the  input  string. 

An  example  may  show  the  usefulness  of  this  facility.  English  syntax 
allows  for  the  "fronting"  of  the  "unknown  element"  in  a question. 
One  would  probably  like  a grammar  to  locate  the  place  from  which  the 
"unknown"  element  was  fronted.  One  strategy,  then,  is  to  place  the 
"unknown"  constituent  on  the  HOLD  list  (via  the  HOLD  action)  and  then 
write  the  grammar  such  that  it  checks  the  HOLD  list  whenever  it  can't 
find  a ccnstituent  in  the  input  string. 

For  example,  consider  the  following  sentences: 

Hho  did  John  see? 

Who  did  John  sit  next  to? 

Who  did  John  believe  that  Harry  saw? 

Mho  does  John  think  the  FBI  wanted  to  arrest? 
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These  sentences  can  be  handled  by  placing  the  constituent  generated 
by  the  "who"  on  the  HOLD  list  indexed  as  a Noun  Phrase.  At  each 
point  in  the  grammar  where  a NP  is  sought  as  either  the  subject  of  a 
sentence,  the  object  of  a sentence,  or  the  object  of  a preposition, 
we  include  a VIR  arc  which  examines  the  HOLD  list  for  a held  NP. 

HOLD  action 

(HOLD  <fonn>  <cat>  <features>) 
eg;  (HOLD  • ’np  nil) 

The  HOLD  action  places  the  expression  <form>  on  the  HOLD  list  indexed 
under  the  category  <cat>.  An  optional  third  argument,  <features>  is 
a list  of  features  to  be  associated  with  the  entry.  In  addition,  the 
level  at  which  this  action  is  being  done  is  recorded  with  the  entry. 
This  allows  a VIR  are  to  select  only  those  constituents  held  at  a 
particular  level . 

Uifi  VIRtual  are 

The  VIR  arc  allows  the  grammar  to  retrieve  an  element  from  the  HOLD 
list.  In  retrieving  the  element,  the  user  can  specify: 

1 The  lexical  category  under  which  the  element  was  held. 

2 A limit  on  the  nuaber  of  levels  above  the  current  one  at  which 
the  element  was  held. 

3 An  arbitrary  test  which  the  element  or  its  features  must  pass. 

The  syntax  frame  for  the  VIR  arc  is: 
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(VIR  <head>  <te3t>  ...  <actions>  ...  <destlnation>) 
where  <head>  can  have  either  of  the  following  two  forms: 

<cat> 

(<cat>  <level>) 

The  first  form  will  match  any  item  on  the  HOLD  list  indexed  under  the 
category  <cat>  regardless  of  the  level  at  which  it  was  held.  In  the 
second  form,  <level>  specifies  the  level(s)  at  which  a candidate  item 
was  held.  This  argument  should  be  a number  (0  or  positive)  which 
limits  the  number  of  levels  above  the  current  one  at  which  the  item 
was  held.  Thus: 

(VIR  (PP  0)  ...) 

will  match  items  indexed  as  a PP  which  were  held  at  the  current  level 
only.  Similarly, 

(VIR  (ADV  2)  ...  ) 

would  find  ADV's  held  at  the  current  level  or  the  next  two  higher 
levels. 

Once  a candidate  match  has  been  found,  the  <test>  argument  is 
evaluated  In  an  environment  where  * is  bound  to  the  candidate 
constituent  and  FEATURES  is  bound  to  the  (optional)  features  list 
associated  with  it.  If  <test>  is  non>NIL,  then  the  VIR  are  is  taken 
and  the  item  is  removed  from  the  HOLD  list. 


XII. 6 Useful  Aetlona 


III. 6. a Aotlona  which  teat  current  Mord 


A number  of  functlona  are  provided  for  teating  the  current  word. 
Theae  are  typically  uaed  aa  further  conditiona  on  an  arc,  for 
example: 

(PUSH  pp  (CAT  prep)  (TO  a tend)) 

or  aa  a condition  in  one  of  the  actiona  in  an  arc: 

(CAT  det  T (and  (OETF  queation) (LIFTR  type  'q))  ...  ) 

WRD  Gomoarea  the  current  word 

(WPD  '<word(8)>) 
eg:  (WRD  who) 

(WRD  (which  that  who)) 

the  WRD  function  returns  T if  the  current  word  (i.e.  •)  matches  its 
single  argument.  If  this  argument  is  an  atom  then  the  current  word 
must  be  identical  to  it.  If  it  is  a list  of  atoms,  then  the  current 
word  must  be  a member  of  this  list. 

CAT  eheeWa  the  lexical  category  of  the  current  mtCd 

(CAT  '<category>) 
eg:  (CAT  pro) 

(CAT  (n  pro  npr)) 

The  CAT  function  tests  the  current  word  to  see  if  it  can  be  a member 
of  the  lexical  category  <oategory>.  As  with  the  WRD  function,  the 
argument  can  be  an  atomic  category  name  or  a list  of  atomic  category 
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ROOT  checks  the  root  fora  the  current  word 

(ROOT  '<root(s)>) 

(ROOT  '<root(s)>  '<category>) 
eg:  (ROOT  be) 

(ROOT  (be  have)) 

The  ROOT  function  compares  the  root  form  of  the  current  word  to  Its 
argument.  Again,  the  argument  can  be  an  atom  or  a list  of  atoms.  It 
returns  T if  and  only  if  the  current  word  has  one  of  the  specified 
root  forms.  Thus  the  expression  (ROOT  be)  would  be  true  if  the 
current  word  were  be,  is,  am,  are,  was, etc.  Note  that  if  the  current 
word  is  homomorphic  between  two  words  with  different  root  forms,  both 
possibilities  will  be  explored.  Thus,  if  the  current  word  were  LIE, 
both  (ROOT  lay)  and  (ROOT  lie)  would  be  true. 

The  ROOT  function  takes  an  optional  second  argument  which  can  be  used 
to  limit  the  lexical  categories  that  are  considered  in  determining 
the  root  form  of  the  current  word. 

t 

gfiiA  JQ  inflectional  feature 

(GETF  '<feature  name>) 
eg:  (GETF  tense) 

GETF  returns  the  value  associated  with  the  feature  <feature  naffle>  for 
the  current  word.  If  the  current  word  has  no  such  feature,  then  NIL 
Is  returned.  These  features  come  from  three  sources: 

1 Features  on  the  dictionary  entry  for  the  current  word. 

For  example,  the  tense  and  number  features  for  the  word  IS  are 
coded  as: 


M2 


(IS  (BE  (tense  present) (number  Bss)) 

2 Features  generated  by  the  Dictionary  Manager 

For  regularly  Inflected  words,  certain  features  are 
automatically  generated. 

3 The  FEATURES  property  of  the  word 

Each  word  can  have  a FEATURES  property  associated  with  It. 
For  example,  the  word  GIVE  might  have  the  features  property: 
(TRANSITHE  INDIRECTOBJECT  PASSIVE) 

where  these  features  might  mean  that  GIVE  is  a transitive  verb 
which  can  take  an  Indirect  object  and  can  be  used  In  a passive 
sentence. 

In  cases  one  and  two  above,  a feature  can  be: 

1 A tuple  whose  first  element  is  the  name  of  the  feature  and 
whose  second  Is  the  value. 

2 A list  of  Just  the  feature  name.  The  value  of  this  feature  Is 
taken  as  T. 

3 An  atomic  feature  name.  The  value  Is  assumed  to  be  T. 

III.6.b  Aetiona  which  leat  Acaitr4rY  words 

CMCtP  cheeks  iq  Inflectional  feature 
(C8ECKF  '<feature>  <word>  *<cat>) 


The  CHECK? 
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function  is  analogous  to  GETF,  differing  only  In  that  it 
applies  to  an  arbitrary  word  given  as  its  argument.  CHECKF  returns 
the  value  of  the  feature  <feature>  associated  with  the  word  <word>. 
An  optional  third  argument,  <cat>,  determines  the  lexical  category 
assumed  for  the  word.  Thus; 

(CHECKF  transitive  (GETR  verb)  'v) 

returns  the  value  of  the  TRANSITIVE  feature  for  the  word  in  the  VERB 
register  when  it  is  interpreted  as  a verb.  Similarly, 

(CHECKF  tns  • 'v) 

returns  the  TNS  feature  for  the  current  input  word  when  interpreted 
as  a verb. 

III.6.C  Miscellaneous  Actions 

The  BUILDQ  function  provides  a convenient  method  for  constructing 
arbitrary  trees  which  contain  values  held  in  registers.  It  is 
modeled  after  the  function  described  in  [Woods  71],  its  syntax  being: 
(BUILDQ  '<8trueture>  . '<fillera>) 

The  first  argument,  <structure>,  Is  an  arbitrary  list  structure  which 
can  contain  special  symbols  at  its  leaves.  The  function  returns  a 
similar  structure  by  replacing  these  symbols  by  values  computed  from 
the  remaining  elements,  <flllers>. 

In  filling  the  structure,  BUZLOO  traverses  the  structure  in 
"preorder".  If  the  symbol  "4'"  is  encountered,  then  the  next  filler 
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argument  is  taken  as  the  name  of  an  ATN  register  and  its  contents  are 
substituted  for  the  If  the  symbol  is  encountered,  it  is 

replaced  by  the  current  value  of  the  atom  On  encountering  the 

symbol  the  next  filler  argument  is  evaluated  and  the  resulting 

value  used  to  replace  the  If  the  symbol  ”§"  is  found  as  the 

first  element  of  a list,  then  the  remaining  elements  of  that  list  are 
interpreted  as  above  and  the  results  are  APPENDed  together. 

A few  examples  may  make  the  operation  of  BUILDQ  clear.  Suppose  we 

have  the  following  register  context: 

DET  the 

NOUN  man 

PRE-MODIFIERS  (large  heavy) 

POST-MODIFIERS  ((PP  (in  the  park))) 

Then  if  we  evaluate: 

(BUILDQ  (NP  # (N  ♦)  (DET  ♦))  (gensym)  noun  det) 
we  would  get: 

(NP  G0027  (N  man)  (DET  the)) 

Evaluating  the  expression: 

(BUILDQ  (NP  # (N  4.)  (e  (MOD)  4-  ♦)) 

(gensym)  noun  pre-modifiers  post-modifiers) 

would  yield  the  expression: 

(NP  (N  man) 

(HOD  large  heavy  (PP  (in  the  park)))) 
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The  NBXTWRD  function  returns  the  "on  deck"  input  word  (i.e.  the  CAR 
of  STRING).  If  STRING  is  empty,  it  returns  NIL. 

III. 7 Tracing  Eagiiitlfia 

Every  programming  language  should  provide  adequate  debugging  tools  to 
enable  the  user  to  perfect  his  programs.  Since  the  ATN  should  be 
viewed  as  a kind  of  programming  language,  we  have  provided  powerful 
tracing  facilities  which  we  have  found  to  be  useful  and  necessary  to 
build  and  debug  large  grammars. 

Functions  are  provided  which  will  print  a trace  of  the  developing 
parse  and  allow  the  user  to  specify  arbitrary  points  at  which  to 
suspend  computation  and  enter  a "break  point".  A facility  is  also 
provided  by  which  the  user  can  monitor  the  values  being  assigned  to 
particular  registers. 

TRACE-STATE  traCtfl  AlH  ALLAlffi 

The  function  TRACE-STATE  allows  the  user  to  trace  the  flow  of  control 
thru  specified  states  and  optionally  break  upon  entering  them.  If  a 
state  is  being  traced,  the  message: 
in  state  <state  name> 

will  be  displayed  on  the  TTY  when  the  state  is  entered.  If  the  state 
falls,  I.e.  no  arcs  from  the  state  lead  to  a final  state,  the 
following  message  will  be  displayed: 
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falling  from  state  <state  neime> 

If  the  user  associates  a BREAKPOINT  with  a state,  then  the  message: 
;BKPT  <state  name> 

will  be  typed  on  the  TTY,  computation  suspended,  and  a LISP 
breakpoint  entered.  Typing  an  "$P"  will  resume  the  ATN  computation. 

The  syntax  of  the  TRACE-STATE  function  Is: 

(TRACE-STATE  '<arg1>  '<arg2>  ...  •<argn>) 
where  each  argument  specifies  one  or  more  states  to  be  traced  and 
optionally  associated  with  a breakpoint.  The  arguments  are 
Interpreted  as  follows: 

<state  naffle>  : 

a 

(<state  name>  BREAK  <test>)  : 

a 

(•  BREAK  <test>)  : 

a 

UNTRACE  untraces  ^ states 

The  function  UNTRACE-STATE  removes  states  from  the  set  of  ATN  states 
to  be  traced.  Its  syntax  Is: 

(UNTRACE-STATE  •<arg1>  ...  •<argn>) 

where  each  argument  specifies  one  or  more  states  to  be  "untraced”. 
If  an  argument  la  the  name  of  an  ATN  state  being  traced,  then  that 
state  Is  removed  from  the  list.  The  argument  * causes  all  states  to 
be  removed  from  the  list  of  states  to  be  traced. 


trace  the  named  state 
trace  all  states 
trace  the  state  and  enter 
breakpoint  if  <test>  is  true, 
trace  all  states  and  enter 
breakpoint  if  <test>  is  true. 
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Reglatera  .saQ  traced  ^ well 


We  provide  a aifflUar  feature  whereby  one  can  monitor  the  valuea 
aaaigned  to  regiatera.  The  function  TRACE-REG  cauaea  a apecified  ATN 
regiater  to  be  "traced".  Whenever  a value  la  aaaigned  to  that 
regiater  with  a SETR,  the  meaaage: 

aettlng  regiater  <regiater  name>  to  <value> 
will  be  typed  on  the  TTY.  Similar  meaaagea  are  dlaplayed  when  a 
regiater  ia  aet  via  the  functions  LIFTR  or  SENDR. 

The  ayntax  of  the  function  TRACE-REG  ia  aimllar  to  that  of  TRACE- 
STATE,  being: 

(TRACE-REG  ’<argl>  •<arg2>  ...  *<argn>) 

An  argument  can  be  the  name  of  a regiater,  the  special  "wildcard"  *, 
or  a three-tuDle  (<register  name>  BREAK  <teat>).  The  last  case 
allows  one  to  trace  the  setting  of  a register  and  enter  a BREAKPOINT 
if  the  arbitrary  lisp  expression  <test>  is  true. 

The  function  UNTRACE-REG  is  used  to  remove  specified  registers  from 
the  list  of  registers  being  traced.  Its  argument  convention  is 
similar  to  that  of  UNTRACE-STATE. 


III. 8 Global  variables 


The  following  is  a list  of  the  most  Important  global  variables  in  the 
interpreter . 
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The  current  constituent 


The  global  variable  * 

is 

always  bound 

to  the 

"current 

constituent.  Precisely 

what  this  is  depends  on  the 

context. 

Within  a CAT  arc,  it 

is 

bound  to  the 

root  form 

of 

the 

current  word . Within 

a 

PUSH  arc  it 

is  bound 

to 

the 

constituent  POPped  by  the 

sub-computation  (except 

in 

the 

"preactions"  where  *•  is  the  current  word).  In  a VIR  arc,  * 
is  bound  to  the  element  being  removed  from  the  HOLD  list. 
For  the  other  arc  types,  it  is  bound  to  the  current  word 
(i.e.  LEX). 

In  several  other  contexts,  * is  temporarily  bound  to  other 
elements.  For  example,  when  evaluating  the  <test>  in  a VIR 
are  or  in  a (GETR  <register>  NEAREST  <teat>),  • is  bound  to 
the  candidate  element. 

LEX  The  current  word 

The  atom  LEX  is  always  bound  to  the  current  word,  exactly  as 
^ it  appears  in  the  input  string.  If  the  input  string  is 

exhausted,  then  LEX  is  bound  to  NIL. 

i 

STRING  The  input  string 

The  atom  STRING  is  always  bound  to  a list  of  the  words 
remaining  in  the  input  string. 

STACK 
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STACK  is  a list  of  the  names  of  states  PUSHed  to  which  have 
not  popped.  This  can  be  used  to  determine  how  deeply 
embedded  the  current  computation  is.  For  example,  the 
following  POP  arc  will  only  be  taken  if  we  will  be  returning 
to  the  top  level  and  the  input  is  exhausted; 

(POP  (buildup) 

(AND  (NULL  stack) (NULL  *)(NULL  string))) 

The  following  additional  global  variables  may  be  of  some  use  to  the 
user: 

ALIST  An  association  list  which  encodes  the  contents  of  the 
registers,  both  current  and  past.  Elements  of  ALIST  are 
tuples  whose  CAR  is  the  name  of  a register  and  whose  CDR  is 
a value  associated  with  that  register. 

STATE  In  the  interpreter,  STATE  is  bound  to  the  current  state  that 
the  ATN  is  in. 

ARC  Bound,  in  the  interpreter,  to  the  current  arc  being 
followed . 

USE-ALL-WOROS?  This  variable  controls  the  ability  to  POP  to  the 
TOP'LEVEL  when  words  remain  in  the  input  string.  This  is 
permitted  only  when  USE*ALL-W0R0S7  is  bound  to  NIL. 

ALIST-STACK  A stack  of  the  values  of  ALIST  for  higher  levels  of 
computation. 
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ARC-STACK  A stack  of  pending  actions  on  any  active  PUSH  arcs  (l.e. 
those  which  have  not  yet  been  popped  to). 

POPSTACK  A stack  of  tags  to  the  active  PUSH  arcs. 
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IV. 1 Introduetlor 

The  purpose  of  the  ATN  Compiler  Is  to  translate  ATN  networks  Into 
pure  LISP  code  which  can  be  executed  directly.  This  code  can 
subsequently  be  translated  Into  machine  language  code  by  a standard 
LISP  compiler.  In  fact,  the  ATN  compiler  has  been  written  In  such  a 
way  as  to  facilitate  the  direct  compilation  of  ATN  networks  Into 
machine  language  by  the  MacLISP  compiler  [Moon  74]. 

AdYftOtagfia  JiZ  compilation 

The  advantages  of  compiling  an  ATN  network  are  the  same  as  those  of 
any  compilation  process:  reduced  execution  time  and  storage  space. 
We  have  applied  our  compiler  on  a modified  version  of  the  LUNAR 
English  grammar  [Woods  72]  and  found  that  the  execution  time  for  the 
resulting  LISP  code  was  decreased  by  about  a factor  of  2.  The 
resultant  machine  language  code  achieved  a speed  up  factor  of  5 to 
10. 

The  compiler  is  able  to  preserve  all  of  the  features  available  under 
the  interpreter.  A grammar  writer  need  take  no  special  steps  or 
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considerations  if  he  intends  to  compile  his  network.  In  other  words, 
if  one  has  a system  of  ATN  networtcs  which  behave  correctly  in  the 
Interpreter,  then  compiling  these  networks  without  modifications  will 
result  in  compiled  code  which  also  behaves  correctly 

Saeclal  Fcalurea  ^ iht  Conplier 

A compiled  ATN  network  is  more  efficient  mainly  because  the 
interpretation  stage  is  by>passed.  This  compiler  also  achieves 
greater  efficiency  thru  four  additional  processes: 

1 The  resulting  LISP  code  is  locally  "optimized"  by  a simple 
pattern  mat'ihing  scheme. 

2 Certain  common  actions  and  arc  conditions  are  "open  coded" 
into  more  primitive  LISP  code. 

3 The  compiler  automatically  detects  local  contexts  in  vrtilch 
certain  ATN  features  are  not  being  used  <e.g.  explicit  failure 
via  the  FAIL  function).  IN  these  contexts,  code  to  handle 
these  features  need  not  be  generated. 

U The  user  is  able  to  globally  "turn  off"  a number  of  general 
ATN  features  which  he  is  not  using.  This  causes  the  compiler 
not  to  produce  the  code  necessary  to  implement  these  features. 
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IV.I.a  Compilation  ^ ^ atate 

The  baaic  organization  of  the  ATN  compiler  la  shown  in  figure  4.  The 
function  00MPILESTATE1  is  applied  to  each  state  in  the  network.  It 
produces  a LISP  function  which  will  simulate  the  action  of  the  state 
when  called.  In  doing  so,  It  applies  the  function  COMPILEARC  to  each 
arc  in  the  state.  The  resulting  code  for  each  arc  is  combined  with 
additional  code  to  produce  the  function. 

The  basic  frame  for  an  ATN  state  function  is: 

(DEFUN  «state  nane>  ATNEXPR  ATNSUBR)  NIL 
<traeing  code> 

(CATCH  (PROGN  <code  for  arc  1> 

<code  for  arc  2> 

...) 

< state  nane>  ) 

Xmore  tracing  code> 

•PAIL  ) 

The  (CATCH  ...  <state  name>)  construct  is  included  to  implement  the 
ability  to  fail  to  a named  state  from  a later  point.  The  inclusion 
of  the  tracing  code  and  the  (CATCH  ...  <state  name>)  expression  is 
under  the  control  of  the  user  thru  switches  described  in  a later 
section. 

Onoe  this  oode  has  been  produced,  an  optimization  procedure  is 
applied  to  the  function.  The  result  is  then  evaluated  if  we  are  in 
the  LISP  interpreter  or  passed  to  the  MacLISP  compiler  if  it  is 
resident . 


P 


Figiire  4 

Ovtrvlcw  of  the  ATN  Compiler 
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iv.i.b  comniletion  siL  An  individual  anc 

The  function  COMPILEARC  is  responsible  for  writing  an  expression  for 
an  individual  arc.  Its  Job  is  to: 

1 verify  that  the  arc  is  in  a legal  format. 

2 dispatch  the  arc  to  the  appropriate  specialist  for 

coopllatlon. 

Currently  the  verification  is  a simple  check  on  the  last  action  of  an 
arc.  For  arc  types  POP,  TO,  JUMP  and  DO  this  action  should  not 
specify  a destination  (i.e.  call  TO  or  JUMP).  For  the  rest  of  the 
arc  types,  the  terminal  arc  should  be  a destination.  If  this 

condition  is  not  met  a warning  message  is  displayed. 

Each  arc  compiler  specialist  compiles  one  type  of  arc.  It  examines 

the  condition  and  actions  and  produces  a LISP  expression  which 

simulates  that  arc.  The  general  form  of  the  code  for  an  arc  is  given 
below: 

(CATCH  ((LAMBDA  (ALIST)  (AND  <arc  dependent  code> 

<test> 

(PROON  . <aetlona>))) 

ALIST) 

ARCFAILURE) 

Tb«  (CATCH  ...  ARCFAILURE)  construction  is  necessary  to  handle  a 
possible  (FAIL  'ARC)  in  the  <aetion8>  or  the  <test>.  The  ((LAMBDA 
(ALIST)  ....  ) ALIST)  is  required  to  Isolste  the  possible  side 

effects  caused  by  the  are. 
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Before  adding  these  two  fragments,  the  arc  compiler  checks  to  see  If 
they  are  really  necessary.  The  <test>  and  <actlons>  are  examined  to 
see  If  they  might  contain  a call  to  a function  with  a side  effect 
(e.g.  a call  to  SETR)  or  a call  to  the  FAIL  function.  In  examining 
the  expressions,  the  compiler  must  he  aware  of  which  expressions  are 
"executable"  and  which  are  potentially  "data". 


The  following  conservative  algorithm  Is  used  to  determine  whether 
evaluating  an  s-expresslon  will  generate  a side  effect. 

1 Trivial  case 

Atoms,  of  course,  can't  cause  side-effects.  Return  NIL. 

2 Known  offenders 

Some  functions  are  known  to  cause  side  effects  (e.g.  SETR). 
Return  T. 

3 Guilty  until  proven  Innocent 

If  the  function  Is  unknown  (l.e.  one  which  the  user  has  written) 
then  assume  It  might  generate  a side  effect.  Return  T. 

4 Fexpers,  etc. 

If  the  function  has  no  arguments  or  Is  known  not  to  evaluate  any 
of  Its  arguments,  then  return  NIL. 

5 Recurse 

Recursively  apply  this  procedure  to  those  arguments  which  are 
evaluated.  If  any  generates  a side-effect,  then  return  T,  else 
return  NIL.  Here  the  compiler  must  know  how  a function 
evaluates  Its  arguments.  For  the  standard  built-in  LISP 

functions  and  the  built-in  ATN  functions  and  actions  this 

Information  Is  stored  for  each  function  In  an  "evaluation 
template". 


An  analogous  procedure  is  used  to  determine  If  evaluating  an 
expression  might  generate  a failure. 


If  the  arc  contains  no  expressions  which  might  have  a side  effect 


then  the  LAMBDA  expression  Is  omitted.  If  no  FAIL  action  Is  found, 
then  the  CATCH  Is  omitted. 

Another  trick  used  by  the  compiler  to  Increase  efficiency  is  to  delay 

the  CATCH  (If  any)  and  LAMBDA  binding  (if  any)  until  the  last 

possible  moment.  Thus,  the  following  WRD  arc: 

(WRD  because  (getr  obJ)(setr  conj  •)(to  siconjoined) ) 

is  compiled  as: 

(AND  (EQ  * 'because) 

(GETR  obj) 

((LAMBDA  (ALIST)  (SETR  conJ  •)(T0  s : conjoined ) ) 

ALIST)) 

Note  that  the  LAMBDA  binding  Is  done  only  If  the  arc  Is  actually 
taken  and  that  no  CATCH  was  generated  since  no  PAIL  action  was 
contained  in  the  arc. 

IV.I.e  Optimization 

The  optimization  phase  of  the  compiler  was  originally  added  to  make 
the  compiler  easier  to  write  and  debug.  The  idea  was  to  simplify  the 
code  generation  process  for  each  of  the  arcs.  In  compiling  an  arc, 
we  needn't  produce  efficient  code,  but  merely  straightforward  code 
which  Is  guaranteed  to  work  properly.  Once  the  code  has  been 
generated,  a standard  "optimizing"  routine  is  applied  to  it  to 
simplify  the  expression. 


The  General  Procedure 
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The  basic  procedure  to  optimize  a LISP  s-expression  it  to  traverse 
the  expression  (which  is  Just  a tree,  after  all)  in  "post  order", 
applying  a simplification  algorithm  at  each  node.  Thus  the 
expression  is  optimized  "ft'orn  the  bottom  up",  i.e.  the  arguments  to  a 
function  are  optimized  before  the  function  call  itself  is  optimized. 
We  must  take  care,  however,  that  the  optimization  procedure  is  only 
applied  to  "executable"  nodes  and  not  to  "data  nodes".  An 
"executable"  node  is  one  which  is  known  to  be  evaluated  as  a LISP 
expression.  All  other  nodes  are  assumed  to  be  "data"  nodes. 

Thus,  the  optimizer  does  not  attempt  to  optimize  a function  call 
unless  that  function  is: 

1 known  to  be  a EXPR,  SUBR,  LEXPR  or  LSUBR 

2 known  to  be  a FEXPR  or  FSUBR  which  evaluates  all  of  its 
arguments  (e.g.  AND,  OR,  etc.) 

3 a special  FEXPR  or  FSUBR  that  the  optimizer  "understands". 

The  functions  covered  above  under  point  three  are  those  which  are 
known  to  evaluate  some  of  these  arguments  in  a special  manner.  For 
example,  the  optimizer  knows  that  the  function  PRX  evaluates  all  but 
its  first  argument  and  that  the  arguments  to  the  (XJND  function  are 
lists  of  expressions  to  be  optimized. 

Simplification 

Once  the  arguments  to  a f\inction  have  been  optimized,  the  function 
call  is  simplified.  This  simplification  is  driven  by  a set  of 
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production-like  rules  which  specify  code  transformations.  These 
transformations  simplify  the  code  but  do  not  change  its  behavior. 
These  rules  are  three- tuples  which  have  the  form: 

(<pattern>  <result>  <reatrictions>) 

Each  rule  states  that  if  a LISP  expression  matches  <pattern>  and 
<restrictions>  evaluates  to  non-NIL,  then  it  can  always  be  replaced 
by  the  simpler  s-expression  <result>. 


The  <pattern> 

and 

<result> 

elements  of 

a 

rule 

are  general  s- 

expressions 

in  which  the 

characters 

« 

and 

? have  special 

significance. 

The 

atom  "?" 

is  allowed 

to 

match 

any  single  s- 

expression.  The  atom  may  match  zero  or  more  "sibling"  s- 

expressions.  Any  atom  whose  first  character  is  a "?"  or  a 
behaves  in  a similar  manner  except  that  that  atom  is  bound  to  the  the 
s-expression(s)  matched. 

As  an  example,  consider  the  pattern: 

(PROG  ?A  *8  (RETURN  •)) 

when  matched  against  the  following  expression: 

(PROG  (A  B C) 

(SETQ  A (TIMES  X X)} 

(SETQ  B (TIMES  X A)) 

(RETURN  (LIST  A B))) 

The  match  succeeds  and  has  the  side  effects  of  setting  the  variables 
?A  and  <>8  to  (A  B C)  and  ((SETQ  A (TIMES  X X))(SETQ  B (TIMES  X A))), 


respectively. 


(AND) 

(AND  "A  T "B) 

(AND  ?A) 

(AND  •A  (AND  *8)  *0 
(APPEND  •a  (list  *8 ) (list  "O 


NIL 

(AND  ‘A  *B)  (OR  ‘A  •B) 

?A 

(AND  *A  *8  *C) 

*D)  (APPEND  ‘A  (LIST  “B  "C)  "D) 


(CAR  (CAR  ?A)} 

(OOND) 

(COND  •A  (NIL  •)  “B) 

(COND  •A  (T  •B)  *0 
(OOND  (?A  ?B)) 

(OOND  (T  •A)) 

(COND  (?A)) 

(COND  ^E  (?A  *8  (PROGN  •C)  •D) 
(CONS  ?A  NIL) 

(DEFUN  7 A ?B  *0  (PROGN  “D)  *E) 
(DEFUN  ?A  ?B  *0  NIL  •D) 

(DO  ?A  ?B  "C  (PROGN  *0)  •E) 

(EQ  ?A  NIL) 

(EQUAL  ’?A  ?B) 

(EQUAL  ?A  '?B) 


(CAAR  ?A) 


NIL 

(COND  “A  “B)) 

(COND  •A  (T  •B))  *0 

(AND  ?A  ?B) 

(PROGN  •A) 

?A 

•F)(COND  "E  (?A  •B  •€  *0)  •F))) 
(NCONS  ?A) 

(DEFUN  ?A  ?E  "C  •D  •£) 

(DEFUN  ?A  ?B  •€  •D) 

(DO  7 A ?B  •€  •D  •E) 

(NULL  7 A) 

(EQ  •7A  ?B)  (ATOM  ?A) 

(EQ  ?A  '?B)  (ATOM  ?B) 


(LAMBDA  7 A »B  (PROGN  *0  *D) 
(LIST  ?A) 

(PROG  7A  *8  (PROGN  *0  *0) 
(PROGN  7 A) 

(PROGN  •A  (PROGN  *8)  *0 


(LAMBDA  7 A *8  *0  »D) 
(NCONS  7 A) 

(PROG  7A  *8  •€  *0) 

?A 

(PROGN  •A  *8  ‘O 


(OR) 

(OR  ?A) 

(OF  •A  T •) 

(OR  •A  NIL  *8) 


NIL 

?A 

(OR  •A) 

(OR  •A  *8) 


Figure  5 

Simplification  Rules 


Figure  5 gives  a list  of  some  of  the  simplification  rules  used  In  the 
ATN  compiler. 


Application  of  the  simplification  rules 


The  simplification  rules  are  applied  according  to  the  following 
algorithm: 

1 Collect  the  rules  which  may  apply. 

The  rules  are  indexed  by  function  name  (l.e.  the  CAR  of  the 
expression) . 

2 Find  a rule  which  matches  the  expression. 

The  matcher  compares  the  <pattern>  and  <restrietlon>  of  each 
rule  against  the  data  until  a match  is  found. 

3 If  no  matching  rule  was  found,  then  stop  else  continue. 

4 Replace  the  expression. 

Replace  the  expression  with  the  instantiation  of  the  <result> 
element  of  the  rule  found  In  step  2. 

5 Continue  with  step  1. 

Since  the  function  being  called  may  have  changed,  we  must  start 
from  the  top. 


Figure  6 shows  an  example  of  the  simplification  process. 


IV. 2 Using  the  ATK  Compiler 


The  ATN  compiler  can  easily  be  used  to  produce  executable  LISP  code 
in  the  LISP  interpreter  or  to  produce  machine  language  code  in 
conjunction  with  the  standard  MacLISP  compiler.  In  either  case,  the 
user  can  selectively  disable  unused  features  to  achieve  faster 
execution  time.  This  is  discussed  in  section  IV.3<c. 
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(PROGN  (AND  T (CX)ND  ((>  X 10.)(SETQ  X 10.)))) 

t 

I 

! (cond  (?a  ?b))  ==>  (and  ?a  ?b) 


(PROGN  (AND  T (AND  (>  X 10.)(SET0  X 10.)))) 

T 

! (and  *a  (and  “b)  *c)  ==>  (and  “a  •b  *c) 


(PROGN  (AIJD  T (>  X 10.)(SET0  X 10.))) 

t 

I 

I (and  *8  t •b)  ==>  (and  *8  •b) 


(PROGN  (AND  0 X 10.)(SETQ  X 10.))) 

T 

I 

I (proRn  ?a)  ==>  ?e 


(AND  (>  X 10.)(SETQ  X 10.)) 


Figure  6 

An  Example  of  Simplification 


IV. 2. a Compiling  networks  Into  LISP  Code 


This  section  describes  how  one  can  compile  networks  Into  LISP  code 
given  he  has  a LISP  Into  which  the  ATN  Interpreter  has  been  loaded 
and  the  networks  defined.  The  ATN  compiler  can  be  loaded  by 
evaluating  the  expression: 

(FASLOAD  ATNCOMPILER  FAS  COi:) 

The  user  can  then  set  whatever  switches  he  wants  (using  SETO).  A 
single  ATN  state  can  be  compiled  with  the  function  COMPILESTATS. 
This  function  takes  one  (unevaluated)  argument,  the  name  of  the  ATN 
state  to  be  compiled.  The  function  defines  an  eculvalent  function 
which  will  be  called  whenever  control  is  transferred  to  that  ATN 
state. 

The  function  COMPILEATN  can  be  used  to  compile  an  entire  ATN  network. 
Its  single  (unevaluated)  argument  should  be  the  name  of  the  ATN 
network  to  be  compiled.  It  simply  applies  COMPILESTATS  to  each  of 
the  states  found  In  the  network. 

IV. 2. b Compilation  in  the  compiler 

To  compile  a file  of  ATN  networks  with  the  standard  MacLISP  compiler 
one  need  only  ensure  that  the  ATN  compiler  Is  loaded  Into  the  MacLISP 
compiler.  The  easiest  way  to  do  this  Is  to  Include  the  following 
expression  at  the  top  of  the  file: 


r • 

I 


r 

r 


I 

I 

I 

i 


I 


11 

n 

n 

11 

n 

n 

n 

n 

n 

n 

n 

n 


(DECURE  (FASLOAD  ATNCOMPILER  FAS  COK)) 

This  will  cause  the  ATN  compiler  to  be  automatically  loaded  when  the 
file  is  compiled. 

Switches  may  be  set  in  a similar  manner.  For  example,  one  might 

include  the  expression: 

(MICLARE  (FASLOAD  ATNCOMPILER  FAS  COM) 

(SETQ  ATNTRACE?  NIL)  ;no  tracinc 

(SETO  FAILARC7  NIL))  ;no  (FAIL  'ARC)  actions 

in  the  file  containing  the  networks. 

Interactions  with  NCQMPLR 

If  the  ATN  networks  call  any  user  defined  functions  they  may  have  to 
be  declared  to  NCOMPLR.  Auxiliary  functions  must  be  declared  if  they 
are  FEXPRs,  FSUBRs,  LEXPRs  LSUBRs,  or  MACROS.  For  an  explanation  of 
how  to  do  this,  consult  [MOON  74]. 

IV.2.C  Switches 

The  following  switches  control  the  generation  of  code  by  the  ATN 
compiler.  For  each  switch  the  default  value  is  given  after  its  name. 

[1]  ATNTRACE?  T 

If  ATNTRACE?  is  T then  code  which  allows  states  to  be  traced 
(i.e.  via  TRACE-STATE)  will  be  generated.  A NIL  value  supresses 
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the  generation  of  this  code.  Thus,  this  switch  might  be  set  to 
NIL  if  the  network  is  fully  debugged  or  a production  module  is 
desired. 

[2]  ATNFAIURC?  T 

If  ATNFAILARC?  is  T then  code  will  be  generated  which  allows  one 
to  use  the  (FAIL  ’ARC)  action.  If  NIL,  then  this  code  is  not 
generated . 

[3]  ATNFAILSTATE?  T 

Code  which  allows  one  to  fail  to  a named  state  is  generated  if 
this  switch  is  T.  If  NIL,  then  the  code  is  not  generated. 
Recall  that  one  can  fail  to  a named  state  with  the  FAIL  arc  ot 
the  FAIL  action. 

[!»)  AWFAILPUSH?  T 

If  T , then  code  will  be  generated  which  allows  one  to  fail  from  a 
sub-computation  (i.e.  to  the  last  PUSH  are  or  PUSHATN  action). 

If  NIL,  then  the  code  will  not  be  generated 

[51  ATHPAIL?  T 

This  switch  has  no  effect  if  it  has  the  value  T,  but  a NIL  value 
disables  all  explicit  failure.  Thus  if  ATNFAIL?  is  NIL,  this 
implies  that  ATNFAILARC  » ATNFAILSTATE  « ATNFAILPUSH  « NIL. 

[6]  ATNSIMPLECAT?  NIL 

This  switch  controls  the  code  generated  for  a CAT  arc  and  the  CAT 
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action.  If  it  is  T,  then  it  is  assumed  that  each  word  will  have 
at  most  one  entry  for  a particular  lexical  category  in  the 
dictionary.  If  this  is  so,  the  compiler  need  not  set  up  a 
decision  point  for  each  CAT  arc  or  action. 
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V 

mifion 


V.1  Introduction 

The  lexicon  consists  of  two  parts:  the  DICTIONARY  which  contains 
words  and  certain  features  associated  with  then  and  the  DICTIONARY 
MANAGER  trtiich  is  a paokafje  of  procedures  for  accessing,  maintaining, 
and  updating  the  dictionary. 

A dictionary  entry  for  a word  typically  consists  of  the  syntactic 
category  that  the  word  belongs  to  and  a list  of  syntactic  features 
for  the  word  when  interpreted  under  that  category.  It  is  a fact  of 
English,  as  in  most  other  languages,  that  many  words  allow  multiple 
senses  for  a single  word.  In  these  cases,  multiple  entries  can  be 
made  in  the  Dictionary.  For  example  the  word  CRASH  might  have  the 
following  entries: 

CRASH  N -es 

CRASH  V -es-ed  (Intransitive) 


This  says  that  crash  can  be  interpreted  as  a noun  (N)  whose  plural  is 
formed  be  adding  the  suffix  "es”  or  as  an  Intransitive  verb  (V)  whose 
inflectional  forms  can  be  generated  by  adding  the  suffixes  "es"  or 
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*ed" 


Initially,  no  entries  need  exist  for  the  regularly  Inflected  forms  of 
words.  Such  forms  are  discovered  and  entries  for  them  generated  as 
they  are  needed.  for  example,  if  the  input  contains  the  word 
CRASHED,  a set  of  programs  (Morphology  Specialists)  are  invoked  which 
discover  that  CRASHED  is  the  past  tense  form  of  the  root  word  CRASH. 
These  programs  insert  the  lexical  entry: 

CRASHED  V (CRASH)  (tense  past) (Intransitive) 

The  words  CRASHES  and  CRASKINC  are  similarly  recognized  and  their 
lexical  entries  generated. 

V.2  liifi  Dictionary  fAXSial 

The  dictionary  entry  for  a word  is  stored  on  the  LISP  property  list 
for  that  atom  (i.e.  the  word).  The  standard  property  names  which 
might  be  used  are  atoms  representing  lexical  categories  (e.g.  N for 
noun,  DET  for  determiner,  etc),  and  three  special  atoms:  COMPOUNDS, 
SUBSTITUTE,  and  FEATURES. 

For  each  lexical  category,  <C>,  a word  sight  have  a property  indexed 
under  the  indicator  <C>.  The  value  of  this  property  depends  on  the 
particular  category  and  the  relationship  between  the  word  and  its 
root  fora.  These  are  discussed  in  the  next  section. 


The  value  of  the  COMPOUNDS  property  for  a word  is  a tree  which 
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represents  pre-deflned  compound  phrases  which  that  word  may  start. 
Associated  with  each  phrase  In  the  tree  is  a word  which  that  phrase 
should  be  mapped  Into.  The  user  need  not  concern  himself  with  the 
exact  representation  of  this  tree  as  the  functions  DEPPHRASE  and  DEPP 
(described  later)  can  be  used  to  create  It. 

The  value  of  the  SUBSTITUTE  property  for  a word  is  a list  of  ohrases 
which  should  replace  that  word.  Thus  It  forms  a kind  of  Inverse  to 
the  COMPOUNDS  mechanism.  This  property  Is  discussed  further  under 
the  ftinctions  DEPSUBSTITUTE  and  DEPSUB. 

The  value  of  the  PEATURES  property  Is  a list  of  atoms  or  sublists 
which  name  "features"  to  be  associated  with  the  word.  These  features 
may  be  accessed  in  several  ways  within  the  ATN  Interpreter  (see,  for 
example,  the  functions  GETP  and  CHECKP). 

The  value  of  a lexical  category  property  for  a word  specifies  the 
root  form(s)  of  the  word  and  a set  of  inflectional  features  to  be 
associated  w:Lth  the  word.  The  exact  encoding  of  this  information  is 
as  follows: 

1 If  the  value  is  a list  whose  first  element  is  atomic , then  the 
first  ela^nt  is  the  root  form  of  the  word  and  the  remaining 
elements  are  the  inflectional  features.  Por  example: 

BLEW  V (blow  (tns  past)) 

says  that  the  word  BLEW  is  the  past  tense  of  the  verb  BLOW. 

2 If  the  value  is  a list  whose  first  element  is  a sub-list,  then 
the  word  has  two  possible  root  forms  when  interpreted  und«*  this 
lexical  category.  Each  sub-list  should  be  of  the  form  discussed 
in  point  one  above.  AN  example  is: 


70 


I 

! 

I 

I 


n 


j 

j 

f 


I 

I 


I 


1 

(' 

I 

n 

ii 

[1 

!i 


f; 

n 

Fi 

n 

n 

n 

n 

n 


SAW  V ((see  (tns  past)) 

(saw  (tns  present) (untensed) )) ) 

This  property  states  that  the  word  SAW  is  either  the  past  tense 
fonn  of  the  verb  SEE  xi£  the  presen tuntensed  form  of  the  verb  SAW 
(what  we  do  to  a board), 

3 If  the  value  Is  atomic,  then  the  word  Itself  Is  taken  as  the 
root  form  and  the  Inflectional  features  are  supplied  by  the 
dictionary  manager.  Exs  tly  what  features  are  supplied  Is 
described  In  the  next  section.  Some  examples  are: 

COOK  V s-ed 
N >s 

FAST  ADJ  er-est 
V s-ed 
N -s 


Ulfi  meaning  ^ atomic  gatCgOnt  ValUCa 


The  lexical  categories  N (noun),  V (verb),  ADJ  (adjective),  and  ADV 
(adverb)  are  treated  In  a special  manner  by  the  dictionary  manager. 
An  atomic  value  for  one  of  these  properties  can  specify  that  the  word 
fits  one  of  a number  of  regularly  Inflected  paradigms. 


For  nouns,  the  following  possibilities  exist: 

* the  noun  Is  Irregular. 

>s  the  plural  Is  formed  by  adding  an  "s". 

•es  the  plural  is  formed  by  adding  an  ”es"  or,  if  the  noun  ends 
in  a "y”,  by  deleting  the  "y"  and  adding  an  "les". 


For  verbs,  we  have  the  following: 

The  verb  is  irregular. 

The  third  person  singular  is  formed  by  adding  an  "s"  and 
the  pastpest-partiolple  is  formed  by  adding  an  "ed”.  ^ ' 
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es-ed  The  third  person  singular  Is  formed  by  adding  an  "es"  and 
the  paatpast-partlelplp  is  formed  by  adding  an  "ed". 

For  adjectives  (ADJ)  and  adverbs  (ADV)  we  have  the  following 
possibilities: 

• The  word  has  no  regularly  inflected  comoarative  or 

superlative  (e.g.  UKSUCCESSFUL> . 

ER-EST  The  comparative  and  superlative  are  formed  by  adding  an  EF 
and  EST,  respectively  (e.g.  LONG).  If  the  word  ends  in  a 
Y,  then  it  is  first  changed  to  an  I (e.g.  HAPPY). 

R-ST  Add  an  R or  ST  to  form  the  comparative  and  superlative 

(e.g.  WIDE). 


V.3  Ihs.  Dictionary  Manager 


IlU  Dictionary  MgaagfiC 

The  Dictionary  Manager  is  a collection  of  programs  which  access, 
maintain,  and  update  the  Dictionary.  They  also  act  as  a filter 
between  the  Parser  and  the  user's  input  suggesting  alternative  words 
as  the  'next  word'.  As  each  word  of  the  input  is  needed  by  the 
parser,  the  Dictionary  Manager  checks  to  see  if  there  is  an  entry  in 
the  Dictionary  for  that  word  or  if  the  word  belongs  to  certain 
categories  which  are  recognized  procedurally,  such  as  numbers.  If  an 
entry  exists,  then  the  word  is  passed  on  to  the  parser  with  a list  of 
features  associated  with  the  word.  These  features  are  those  found  in 
the  dictionary  together  with  any  others  suggested  by  the  Morphology 
Specialists. 
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If  the  word  does  not  have  a lexical  entry,  then  a series  of 
Morphology  Specialists  are  Invoked  to  see  If  the  word  Is  a regularly 
Inflected  form  of  a known  word.  These  specialists  use  their 
knowledge  of  typical  English  affixes  to  propose  candidate  'roots'  for 
the  word.  Each  candidate  Is  then  looked  up  In  the  Dictionary  and,  If 
found,  checked  to  see  If  It  acceots  the  affix  removed. 

If  this  process  falls  then  the  Punctuation  Checker  Is  called  which 


examines  the 

word 

to 

see 

If 

It 

might 

contain  any  embedded 

punctuation. 

For 

example. 

If 

the 

Input 

la  TAIL-NUMBER , the 

Punctuation  Checker 

would 

suggest 

that 

the  two 

words  TAIL  and  NUMBER 

be  substituted. 

Finally,  If  the  word  Is  still  unknown,  control  Is  passed  to  the  New 
Word  Learner.  This  module  Interacts  with  the  user  and  attempts  to 
create  a lexical  entry  for  the  word. 

Two  other  subsystems  generate  alternative  suggestions  for  the  'next 
word':  a Compound  Word  Recognizer  and  a Word  Substltuter.  The 
Compound  Word  Recognizer  la  used  to  map  short  phrases  Into  single 
"words".  For  example,  we  might  map  the  phrases  UNITED  STATES  and 
UNITED  STATES  OF  AMERICA  Into  the  single  "word"  USA.  Just  before  a 
new  word  la  passed  to  the  parser,  the  Compound  Word  Recognizer  checks 
to  see  If  that  word  can  begin  a phrase  that  It  knows.  If  so,  then  It 
checks  the  rest  of  the  sentence,  and  If  It  matches,  suggests  the 


alternative  'next  word'. 


The  Word  Substituter  provides  a similar  mechanism  - one  which  can 
expand  a single  word  in  the  input  into  a sequence  words.  For 
example,  the  word  DIDN'T  could  be  expanded  into  the  sequence  DID  NOT. 
The  sequence  of  words  substituted  can  be  of  zero  length,  which 
provides  a facility  for  Jji^noring  words.  This  facility  might  be  used, 
for  example,  to  ignore  extraneous  punctuation. 

V.lt  Auxiliary  fuactlgHf 

Several  functions  are  provided  which  simplify  creating  dictionary 
entries  and  loading  and  manipulating  dictionary  files. 

V.4.a  Creating  Dictionary  Entries 

DEFIMEWORD  DEFW  create  dictionary  gntfisa  iQL  A Ugllg 

(DEFINEWORD  <word>  <deflnitlon>  ) 

(DEFW  •<word>  . *<definltion>  ) 

These  functions  are  used  to  create  a dictionary  entry  for  the  word 
<word>  or  to  add  information  to  the  dictionary  entry  for  <word>  if  an 
entry  is  already  in  the  dictionary. 

The  function  DEFINEWORD  takes  two  arguments:  a word  and  a "property 
list  ffagment".  It  adds  the  property  list  Segment  to  the  property 


list  of  the  word. 


The  function  DEFW  takes  an  Indefinite  number  of  arguments,  none  of 


which  are  evaluated.  The  first  Is  the  word  and  the  remaining  ones 
are  alternating  property  names  and  values.  The  name-value  pairs  are 
added  the  the  word's  property  list. 


Examples  of  these  functions  In  use  are: 

(DEFINEWORD  'fly  '(v  • n -es  features  (Intrans  trans))) 
(DEFINEWORD  '1  '(pro  •)) 

(DEFW  ship  n -s  V -s-ed  features  (trans  passive)) 

(DEFW  ? cunct  «) 


And  DSfP  Qiisals.  Coappuads 


(DEFINEPHRASE  <phrase>  <word>) 
(DEFP  '<phrase>  *<word>) 


These  functions  create  COMPOUNDS  entries  for  a word.  A COMPOUNDS 
entry  will  cause  the  ATN  Interpreter  to  map  a sequence  of  words  In 
the  Input  string  Into  a single  word.  These  functions  will  cause  the 
Input  sequence  <phrase>  (which  should  be  a list  of  atoms)  to  be 
replaced  by  the  single  word  <word>. 


For  example: 

(DEFP  (how  much)  howmuch) 


would  create  or  modify  the  COMPOUNDS  entry  for  the  word  HOW  such  that 
If  the  ATN  Interpreter  encounters  the  two  word  sequence  "HOW  MUCH"  In 
the  Input  string,  It  will  replace  these  two  words  with  the  single 
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word  HOWMUCH.  Note  that  if  subsequent  parsing  fails,  the  interpreter 
will  also  try  parsing  the  string  as  containing  the  two  individual 
words  HOW  and  MUCH. 

Several  COMPOUNDS  entries  can  be  created  at  once  by  including  sub- 
lists in  the  first  argument.  For  example,  (DEFP  (how  (much  many)) 
howmuch)  is  the  equivalent  to: 

(DEFP  (how  much)  howmuch) 

(DEFP  (how  many)  howmuch) 

Thus,  a sublist  in  the  <phrase>  argument  specifies  that  one  of  its 
elements  be  in  the  compound  phrase. 

DEFSUBSTITUTE  ^ DEFSUB  create  SUBSTITUTE  entries 

(DEFSUBSTITUTE  <word>  <phrase>) 

(DEFSUB  '<word>  '<phrase>) 


These  functions  create  or  modify  SUBSTITUTE  properties  for  a word. 
The  first  argument  is  an  atom  and  the  second  argument  is  an  atom  or 
list  of  atoms  which  should  be  substituted  for  the  first  whenever  it 
occurs  in  the  input  string.  The  following  example  will  explain  the 
various  possibilities: 

(DEFSUB  aircraft  plane) 

If  the  second  argument  is  an  atom  (but  not  the  atom  NIL), 
then  it  will  replace  the  first  argument  in  the  input 
string.  Thus,  PLANE  will  replace  Instances  of  the  word 
AIBClUFT  in  the  input  string. 

DCFSUf  don't  (do  not)) 

If  the  second  argument  is  a list  of  atoms,  then  that  list 


76 


will  replace  the  first  argument  in  the  input  string.  Thus, 
occurances  of  the  word  DON'T  will  be  replaced  by  the  two 
words  DO  and  NOT. 

(DEFSUB  please  NIL) 

If  the  second  argument  is  NIL,  than  the  first  argument  will 
be  deleted  from  the  input  string.  Thus  instances  of  the 
word  PLEASE  will  be  deleted  from  the  input  string.  This 
provides  a simple  method  for  Ignoring  certain  words  (such 
as  "extraneous"  punctuation) . 

As  with  the  COMPOUNDS  mechanism,  if  a substitution  is  made  and 
failure  later  backs  up  to  that  point,  the  ATN  interpreter  will 
attempt  to  continue  the  parse  without  making  the  substitution. 

V.4.b  Dictionary  Files 

If  one  is  dealing  with  a large  number  of  words  it  is  convenient  to 
store  them  in  a file  in  a special  simplified  format.  Two  functions 
are  provided  to  manipulate  such  dictionary  files. 

The  format  of  a dictionary  file  is  a sequence  of  lists.  The  first 
element  of  each  list  is  an  atom  or  a list.  If  it  is  an  atom,  then 
the  list  is  taken  as  the  argument  list  to  the  function  DEFW.  In 
other  words,  the  first  element  is  a word  and  the  remaining  elements 
are  alternating  property-names  and  associated  values.  If  the  first 
element  is  a sub-list,  then  the  entire  list  is  used  to  supply 
arguments  to  the  function  DEFP.  In  this  case,  the  first  element  is  a 
phrase  and  the  remaining  element  (there  should  be  only  one)  is  the 


word  that  x'le  phrase  gets  mapped  into.  The  following  might  be  the 


beginning  of  such  a dictionary  file; 

(a  det  *) 

(about  prep  *) 

(an  det  •) 

((and  or)  and-or) 

(and-or  conj  *) 

(ask  V s-ed  features  (trans  indobj  thatcomp  tocomp)) 


LQADDIC  loads  & Dictionary  File 

(LOADDIC  . ’<file  specifications>) 
eg;  (LOADDIC  navy  die  dsk  (1000  130))^ 

The  function  LOADDIC  is  used  to  read  a dictionary  file  and  create  the 
snecified  entries.  Its  syntax  is  similar  to  that  for  the  MacLISP  10 
functions  ("old  10"  functions,  that  is).  It  takes  from  one  to  four 
arguments  where  the  first  is  the  file  name,  the  second  the  file 
extension,  the  third  the  device  and  the  fourth  is  the  PPN. 
Unsupplied  arguments  are  defaulted  as  follows; 


arg 

1 

<flle  name> 

; must  be  supplied 

arg 

2 

<flle  exten8lon> 

; defaults  to  DIC 

arg 

3 

<deviee> 

; defaults  to  current  device 

arg 

4 

<ppn> 

; defaults  to  the  current  ppn 

GBIMDPIC  reformats  ^ dictionary  file 

(GRINDDIC  . '<flle  speclflcatlons>  ) 
eg;  (GRINDDIC  navy  die) 
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This  function  sorts  and  grinds  (l.e.  pretty-prints)  a dictionary 
file.  The  sorting  function  sorts  each  top  level  list  In  the  file  by 
the  "leftmost"  atom  In  the  list  only.  The  argument  syntax  and 
meaning  Is  Identical  to  the  function  LOADDIC.  The  function  GRINDDIC 
Is  usefVil  In  maintaining  growing  dictionaries  in  an  easily  readable 
format . 

Miscellaneous  dlgtionarv  BanaKing  functions  provided 

Currently,  one  other  auxiliary  function  Is  supplied  for  managing  a 
dictionary.  The  function  SHOWWORD  "pretty-prints"  the  dictionary 
entry  for  a word.  It's  syntax  is: 

(SHOWWORD  •<word>  ) 
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VI 

Auxiliary  Functions 


VI. 1 Interfacing  with  the  LISP  Editor 

The  functions  EDITATN  and  EDITSTATE  allow  one  to  apply  the  LISP 
Editor  [Gabriel  75]  to  ATN  networks  and  individual  states.  These 
functions  do  not  normally  reside  in  the  ATN  system  but  are 
automatically  loaded  If  they  are  called.  In  addition,  this  subsystem 
modifies  appropriate  Editor  variables  so  that  the  editor  can 
recognize  ATN  network  definitions  in  files.  Thus,  one  can  say  (EDIT 
<network  name>  <file  speclflcations>)  and  expect  the  editor  to  find 
the  network  definition  in  the  specified  file.  This  also  allows  one 
to  use  the  REFILE  editor  command  to  update  the  file  once  the  networks 
have  been  edited. 

EPIIAIM  aBPiiea  Ihs.  Editor  ^ AUi  network 

This  function  seta  up  an  ATN  network  for  editing  and  then  Invokes  the 

LISP  Editor.  It  takes  one  or  no  arguments,  the  syntax  being: 

(EDITATN  <network  name>} 

(EDITATN  <state  name>) 

(EDITATN  •) 

(EDITATN) 
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If  EDITATN  is  called  with  the  name  of  a network,  then  that  network  is 
set  up  for  editing.  If  called  with  the  name  of  an  individual  state, 
then  the  editor  is  invoked  on  the  network  which  includes  that  state. 
If  called  with  the  argument  • , then  the  network  which  includes  the 
current  state  is  edited.  If  no  argument  is  given,  then  the  network 
which  was  last  edited  is  set  up  for  re-editing. 

EPUSIAIE  editor  ^ au  individual  KIR  state 

The  function  EDITSTATE  is  similar  to  EDITATN  except  that  it  applies 

the  editor  to  an  individual  ATN  state.  If  an  argument  is  given,  it 

can  be  the  name  of  the  state  to  be  edited  or  the  sneclal  atom  * which 

refers  to  the  state  that  the  interpreter  is  currently  in.  If  no 

argument  is  given,  the  editor  is  primed  with  the  last  edited  ATN 

state.  The  syntax  is:" 

• 

(EDITSTATE  ’<state  name>) 

(EDITSTATE  •) 

(EDITSTATE) 

VI  .2  Displaying  ^ networks  ^ atatea 

aHQWATM  displays  au  AUL  network 

The  function  SHOWATN  is  used  to  display  an  entire  ATN  network  in 

"pretty  print"  format.  The  optional  argument  is  interpreted  in  a 

manner  Identical  to  that  for  the  function  EDITATN.  The  options  are: 

(SHOWATN  '<network  naffle>) 

(SHOWATN  '<8tate  naffle>) 

(SHOWATN  •) 


i 

\ 

i 


(showat: 


SHQMSTAIE  diaplavs  ^ individual  AUL  state 

The  SHOWSTATE  function  "pretty  prints"  an  individual  ATN  state.  Its 
optional  argument  is  interpreted  like  that  of  EDITSTATE.  The  syntax 
is: 

(SHOWSTATE  '<state  naffle>) 

(SHOWSTATE  *) 

(SHOWSTATE) 
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APPENDIX  A 

A Sample  ATN  Network 


The  following  three  ATN  networks  comprise  a simple  grammar  to  a small 
subset  of  English.  Note  that  most  features  of  English  syntax  are  not 
handled.  Two  auxiliary  functions,  BUILDSENTENCE  and  BUILDNP,  are 
used  to  construct  the  parse  trees. 


(defatn  sentence 

((s:  (cat  aux  t (setr  v *)  (to  s:aux)) 

(push  np:  t (setr  subj  •)  (to  s:subj)) 

(vlr  np  t (setr  subj  •)  (to  s:subj))) 

(s:aux  (push  np:  t (setr  subj  •)  (to  s:v))) 

(s:subj  (cat  v t (setr  v *)  (to  s:v))) 

(s:v  (cat  V 

(getf  pastpart) 

(cond  ((eg  $v  'be) 

(hold  $subj  'np  nil) 

(setr  subj  'someone)) 

((eg  $v  'have) 

(setr  aspect  'perfect)) 

(t  (fall  'irc))) 

(setr  V •) 

(to  s:v)) 

(push  np:  (getf  transatlve  $v)  (setr  obj  •)  (to  s:obJ)) 
(vlr  np  (getf  transatlve  $v)  (setr  obj  *)  (to  s:obj)) 
(and  (wrd  to  (getf  strans  $v)) 

(push  s:subj 
t 

(sendr  subj  |obj) 

(setr  obj  •) 

(to  s:end))) 

(and  (wrd  that  (getf  strans  $v)) 

(push  s:  t (setr  obj  •)  (to  s:end))) 

(jump  s:end  (getf  Intrans  $v))) 

(s:obj  (push  np: 

(getf  Indobj  $v) 

(setr  Indobj  $obj) 

(setr  obj  •) 

(to  s:end)) 

(vlr  np 
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(getf  indobj  $v) 

(setr  indobj  $obj) 

(setr  obj  •) 

(to  8 : end ) ) 

(and  (wrd  to) 

(push  s:subj 
t 

(sendr  subj  $obJ) 

(setr  obJ  *) 

(to  s:end))) 

(Jump  s:end)} 

(s:end  (push  pp;  (cat  prep)  (addr  pps  *)) 
(pop  ( buildsentence) ) ) ) ) 


1- 


i 


(defatn  noun-phrase 

((np;  (cat  det  t (setr  det  •)  (to  np:det))  (Jump  nprdet)) 
(np;det  (cat  adj  t (addr  adj  *)  (to  nprdet)) 

(cat  n t (addr  adj  *)  (to  np:det)) 

(cat  n t (setr  n •)  (to  np:n))) 

(np:n  (push  pp;  (cat  prep)  (addr  modifiers  *)  (to  np:n)) 
(and  (wrd  (which  who  that  whom) 
t 

(hold  (buildnp)  'np  nil)) 

(push  s:  t (addr  modifie»''s  *)  (to  nptend))) 
(Jump  np;end)) 

(nptend  (pop  (buildnp))))) 

(defatn  prep- phrase 

((pp:  (cat  prep  t (setr  prep  •)  (to  pptprep))) 

(pptprep  (push  np:  t (setr  np  •)  (to  pptend)) 

(vir  np  t (setr  np  •)  (to  pptend))) 

(pptend  (pop  (buildq  (pp  +)  prep  np))))) 

(defun  buildsentence  nil 

(buildq  (s  (verb  -f)  (subj  +)  (obj  +)  (indobj  +)  (pps  +)) 

V 

subJ 

obJ 

indobj 

(defun  buildnp  nil 

(buildq  (np  (n  +)  (det  +)  (S  (adj)  •♦■)  (§  (modifiers)  +)) 
n 

det 

adj 

modifiers) ) 


I 
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APPEKDi::  c 

Examples  of  Operation 


I 

? [1 

i n 

t 

fi 

n 
n 
^ n 

i n 


y 
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Given  below  are  the  results  of  applying  the  example  ATN  network  to 
six  sentences.  For  each,  the  sentence  is  given  exactly  as  it  was 
typed,  preceded  by  a Following  that,  the  list  POPed  by  the  ATN 

network  is  given. 

» which  planes  required  engine  maintenance? 

(S  (VERB  REQUIRE) 

(SUBJ  (NP  (N  PLANE)  (DET  WHICH)  (ADJ)  (MODIFIERS))) 

(OBJ  (NP  (N  MAINTENANCE)  (DET  NIL)  (ADJ  ENGINE)  (MODIFIERS))) 
(INDOBJ  NIL) 

(PPS  NIL)) 

>>  electrical  repairs  were  required. 

(S  (VERB  REQUIRE) 

(SUBJ  SOMEONE) 

(OBJ  (NP  (N  REPAIR)  (DET  NIL)  (ADJ  ELECTRICAL)  (MODIFIERS))) 
(INDOBJ  NIL) 

(PPS  NIL)) 

>>  I want  a list  of  those  planes 
(S  (VERB  WANT) 

(SUBJ  (NP  (N  I)  (DET  NIL)  (ADJ)  (MODIFIERS))) 

(OBJ  (NP  (N  PLANE)  (DET  THOSE)  (ADJ)  (MODIFIERS))) 

(INDOBJ  (NP  (N  YOU) 

(DET  NIL) 

(ADJ) 

(MODIFIERS  (PP  TO 

(NP  (N  LIST) 

(DET  NIL) 

(ADJ) 

(MODIFIERS)))))) 

(PPS  NIL)) 


>>  I want  you  to  list  those  planes 


1 


(s  (VERB  vr.r) 

(SUBJ  (NP  (N  I)  (DET  NIL)  (ADJ)  (MODIFIERS))) 

(OBJ  (S  (VERB  LIST) 

(SUBJ  (NP  (N  YOU)  (DET  NIL)  (ADJ)  (MODIFIERS))) 
(OBJ  (NP  (N  PLANE)  (DET  THOSE)  (ADJ)  (MODIFIERS))) 
(INDOBJ  NIL) 

(PPS  NIL))) 

(INDOBJ  NIL) 

(PPS  NIL)) 

» I want  to  see  a list  of  the  planes  that  required 
extensive  engine  repairs. 


(S 

(VERB  WANT) 

(SUBJ  (NP  (N  I)  (DET  NIL)  (ADJ)  (MODIFIERS))) 

(OBJ 

(S  (VERB  SEE) 

(SUBJ  (NP  (N  I)  (DET  NIL)  (ADJ)  (MODIFIERS))) 

(OBJ  (NP  (N  LIST) 

(DET  A) 

(ADJ) 

(MODIFIERS  (PP  OF 

(NP  (N  PLANE) 

(DET  THE) 

(ADJ) 

(MODIFIERS  (S  (VERB  REQUIRE) 

(SUBJ  (NP  (N  PLANE) 
(DET  THE) 
(ADJ) 

(MODIFIERS))) 
(OBJ  (NP  (N  REPAIR) 
(DET  NIL) 
(ADJ 

EXTENSIVE 
ENGINE) 
(MODIFIERS))) 
(INDOBJ  NIL) 

(PPS  NIL)))))))) 

(INDOBJ  NIL) 

(PPS  NIL))) 

(INDOBJ  NIL) 

(PPS  NIL)) 


>>  I want  you  to  show  me  the  planes  which  have  poor 
maintenance  records. 

(S  (VERB  WANT) 

(SUBJ  (NP  (N  I)  (DET  NIL)  (ADJ)  (MODIFIERS))) 
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(OBJ  (S  (VERB  SKO.  ) 

(SUBJ  (NP  (N  YOU)  (DET  NIL)  (ADJ)  (MODIFIERS))) 

(OBJ  (NP  (N  PLANE) 

(DET  THE) 

(ADJ) 

(MODIFIERS  (S  (VERB  HAVE) 

(SUBJ  (NP  (N  PLANE) 

(DET  THE) 

(ADJ) 

(FWDIFIERS))) 

(OBJ  (NP  (N  RECORD) 

(DET  NIL) 

(ADJ  POOR  MAINTENANCE) 
(MODIFIERS))) 

(INDOBJ  NIL) 

(PPS  NIL))))) 

(INDOBJ  (NP  (N  I)  (DET  NIL)  (ADJ)  (MODIFIERS))) 

(PPS  NIL))) 

(INDOBJ  NIL) 

(PPS  NIL)) 
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II 


APPENDIX  D 

Examples  of  tracing  States 


I 


The  following  example  shows  a trace  of  the  action  of  the  ATN  network 


In  parsing  a sentence.  This  trace  was  generated  by  evaluating 
(TRACE-STATES  *).  Again,  the  Input  string  Is  preceded  by  "»"  and 
the  final  value  FOPed  by  the  network  Is  displayed  at  the  end.  Values 
POPed  by  final  states  have  been  abbreviated. 


» Which  aircraft  have  very  poor  maintenance  records? 

In  state  S: 
pushing  to  state  NP: 

In  state  NP: 

in  state  NP:DET 

In  state  NP:DET 

failing  from  state  NP:DET  “ 

’ In  state  NP:N 
Jump  to  state  NP:BND 
In  state  NP:END 

pop  from  state  NP:  with  value  (NP  (N  PLANE)  ...  ) 
Jump  to  state  S:S(JBJ 
In  state  S:SUBJ 
In  state  S:V 
pushing  to  state  NP: 
in  state  NP: 

Jiap  to  state  NP:DET 
in  state  NP:DET 
in  state  NP:DET 
in  state  i(P:DET 
in  state  NP:DET 
in  state  NP:DET 
failing  tram  state  NP:0ET 
in  state  NP:DET 
failing  from  state  NP:DET 
in  state  NF:N 
Jusp  to  state  NP:BND 
in  state  1IP:EIID 

pop  froa  state  NP:  with  value  (NP  (N  RECORD)  ...  ) 
Jnap  to  state  8:0BJ 


.r 
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in  state  S:OBJ 
Junp  to  state  S:END 
In  state  S:END 

pop  from  state  TOPLEVEL  with  value  (S  (VERB  HAVE)  ...  ) 
(S  (VERB  HAVE) 

(SUBJ  (HP  (N  PLANE)  (DET  WHICH)  (ADJ)  (MODIFIERS))) 
(OBJ  (NP  (N  RECORD) 

(DET  NIL) 

(ADJ  VERY  POOR  MAINTENANCt. ) 

(MODIFIERS))) 

(INDOBJ  KID 
(PPS  NIL)) 


<« 


3 
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APPEKDi::  E 

Examples  of  Tracing  Registers 

This  example  shows  a trace  of  the  values  assigned  to  ATN  registers  as 
well.  This  was  achieved  by  evaluating  (TRACE-STATES  and  (TRACE- 
REG  *). 


» I want  you  to  list  those  planes 

in  state  S: 
pushing  to  state  NP: 
in  state  NP: 

Jump  to  state  NP:DET 

in  state  NP:DET 

setting  register  ADJ  to  (I) 

in  state  NP:DET 

failing  from  state  NP:0ET 

setting  register  N to  I 

in  state  l(P:N 

Jump  to  state  NP:END 

in  state  NP:END 

pop  from  state  NP:  with  value  (NP  (NX)  ...  ) 
setting  register  SUBJ  to  (NP  (NX)  ...  ) 

Jump  to  state  S:SUBJ 
in  state  S:SUBJ 
setting  register  V to  WANT 
in  state  S:V 
pushing  to  state  NP: 
in  state  NP: 

Jump  to  state  NP:DET 
in  state  NP:OBT 

setting  register  ADJ  to  (TOU) 

in  state  NP:DET 

falling  from  state  NP:DET 

setting  register  N to  YOU 

in  state  NP:N 

Juip  to  state  NP:EKD 

in  state  NP:EI(I> 

pep  from  state  NP:  with  value  (NP  (N  TOU)  ...  ) 
setting  register  OBJ  to  (NP  (N  YOU)  ...  ) 

Jump  to  state  S:0BJ 
in  state  S:0BJ 
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push Inc  to  state  KP: 
in  state  NP: 

Jump  to  state  NP:DET 
In  state  NP:DET 
failing  n*offl  state  NP:DET 
failing  ft*om  state  NP: 
sending  register  SUBJ  to  (NP  (N  YOU)  ...  ) 
pushing  to  state  S:SUBJ 
in  state  StSVBJ 
setting  register  V to  LIST 
in  state  S:V 
pushing  to  state  NP: 
in  state  NP: 

setting  register  DET  to  THOSE 
in  state  NP:DET 

setting  register  AOJ  to  (PLANE) 

in  state  NP:DET 

falling  from  state  NP:DET 

in  state  NP:DET 

falling  ft*oa  state  NP:DET 

in  state  NP:DET 

failing  from  state  NP:OET 

setting  register  N to  PLANE 

in  state  NP:N 

Jump  to  state  NP:END 

in  state  NP:END 

pop  from  state  NP:  with  value  (NP  (N  PLANE)  ...  ) 
setting  register  OBJ  to  (NP  (N  PLANE)  (DET  THOSE)  ...  ) 
Jump  to  state  S:OBJ 
in  state  S:OBJ 
Jump  to  state  S:END 
in  state  S:ENO 

pop  from  state  S:SUBJ  with  value  (S  (VERB  LIST)  ...  ) 
setting  register  OBJ  to  (S  (VERB  LIST)  ...  ) 
juap  to  state  8: END 
in  state  S:BND 

pop  frost  state  TOPLEVEL  with  value  (S  (VERB  WANT)  ...  ) 

(S  (VERB  WANT) 

(SOBJ  (NP  (N  :)  (DET  NIL)  (AOJ)  (MODIFIERS))) 

(OBJ  (S  (VERB  LIST) 

(SOBJ  (NP  (N  TOO)  (DET  NIL)  (ADJ)  (MODIFIERS))) 
(OBJ  (NP  (N  PLANE)  (DET  THOSE)  (ADJ)  (MODIFIERS))) 
(INOOBJ  NIL) 

(PPS  NIL))) 

(INOOBJ  NIL) 

(PPS  NIL)) 
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fINITIALSTATE  15 

AGAIN  15,  16 
ALIST  49 
ALIST-STACK  49 
AND  arc  25 
ARC  4S 
ARC-STACK  50 
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ATNFAILPUSH?  65 
ATNFAILSTATE?  65 
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BREAKPOINT  46 
BUILDQ  43 
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CAT  function  40 
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COMPOUNDS  74 
COMPOUNDS  proparty  68 


DEFARC  27 
DEFATN  26 
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DtFSUBSTITUTE  75 
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PAIL  action  65 
FAIL  arc  22,  65 
PAIL  function  35 
FEATURES  24,  39 
FEATURES  property  69 

(2TF  41,  69 
GETR  31 

HOLD  action  3- 

HOLD  list  21,  23,  37,  38,  39 

initial  state  15 

JUMP  arc  22 
JUMP  function  34 


[] 

n 


LEX  48 

lexical  categories  68 
LIFTLIST  31 
LIFTR  30 
LISP  Editor  79 


MaoLISP  compiler  51 

NEXTVRD  45 
NULLR  32 

PARSE  14 
PARSE1  14 
PHRASE  arc  19 
POP  arc  20 
POPI  arc  21 
POPATH  35 
POPSTACK  50 
post-actions  20 
preaotions  20 
Punctuation  72 
POSH  arc  20 
PUSHATN  34 

RBAOTEXT  14,  16 
register  29 
RISUME  36 
RSSOMETAG  36 
^‘.-'ROOT  arc  18 
‘ “ ROOT  function  41 

SEIXXST  30 
SniDR  30 
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SETR  29 

SHOWATN  80 

SHOWSTAIE  81 

STACK  48 , 49 

STATE  49 

STRING  48 

sub-conputatlon  20 

SUBSTITUTE  properties  75 

SUBSTITUTE  property  69 

syntax  fraae  5 

TO  arc  22 
TO  function  33 
TRACE-REG  47 
TRACE-STATE  45 
tracing  facilities  45 
TST  arc  24 

UNTRACE-REG  47 
UNTRACE-STATE  46 
USE-ALL-WOROS?  21 , 49 

VIR  arc  38 
virtual  arc  23 

WRD  arc  17 
WRD  function  40 
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